Sleep: What am I trying to measure?

My last post about analysing my sleep data had plenty of caveats, but despite my caution I started to wonder whether I was taking an interest in the right variables.

I’m aiming to sleep better for health and to feel more alert during the day. My first thought was to find out what influences how many hours I sleep each night. This was a guesstimate of my hours of sleep based on roughly when I fell asleep and woke up, minus any trips to the bathroom or time spent staring at the ceiling in frustration. Then I’d compare this to various lifestyle measures like how much I’d eaten, exercise, screen time, etc., to see what, if anything, correlated with a long sleep. Despite buying a gadget to help measure it, I’m not sure I have a more accurate measure of sleep quality, so approximate time asleep is what I tried.

However, I’ve realised that there are several ways in which “Hours that night” as I call it might not be the most useful measure. For example, there are times when I can’t get a full night’s sleep no matter how well prepared my body is for it. Sometimes I have to get up early for work, to go on holiday or because I have an audax that starts at 6am. Occasionally my daughter is ill and will wake me up several times. These things are thankfully rare, but could skew the results. I could simply delete any results where my maximum possible sleep was less than six hours, but this leaves less extreme cases.

I also recorded the maximum possible hours I could get each night. In my spreadsheet I subtracted the “Hours that night” from this to get “Missed sleep”, thinking that would be a better measure. On the other hand, if I can only get three hours maximum and I miss none, is that really better than having a Saturday lie-in for up to nine hours, but only sleeping for eight, meaning missed sleep is one hour? Who knows how many hours I might have got if I’d tried to sleep for more than three hours?
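To make that comparison concrete, here is a minimal sketch of the “Missed sleep” calculation for the two nights described above. The column names match my spreadsheet, but the rows are made-up illustrative values rather than my real data.

```python
import pandas as pd

# Two illustrative nights (made-up values, not real data):
# an early start with only 3 hours available, and a Saturday lie-in.
nights = pd.DataFrame({
    "Max possible (hrs)": [3.0, 9.0],
    "Hours that night": [3.0, 8.0],
})

# "Missed sleep" is simply what was available minus what I actually got.
nights["Missed sleep"] = nights["Max possible (hrs)"] - nights["Hours that night"]

print(nights)
```

By this measure the three-hour night scores better (nothing missed) than the eight-hour one, which is exactly the oddity I’m worried about.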

So I tried working out some kind of scaling adjustment, so that “missing” one hour out of a possible nine gives a better score than missing one hour out of a possible seven. I could ignore anything over eight hours as most people are unlikely to sleep that long unless they’ve missed out on sleep the night before. But that makes a hard cut-off, which feels wrong.

So I’ve come up with a simple scaling algorithm which looks like this:-

# target_sleep and hours_noise_threshold must be defined before this is called.
def missed_sleep_scaled(row):
    # Cap the available sleep at the target, so a long lie-in doesn't count extra.
    useful_max = min(target_sleep, row['Max possible (hrs)'])
    if useful_max == 0:
        # No sleep was possible, so the result is invalid.
        return -1
    # Hours actually slept are capped at the target too.
    useful_missed_sleep = useful_max - min(row['Hours that night'], target_sleep)
    # Treat small shortfalls as measurement noise.
    if useful_missed_sleep <= hours_noise_threshold:
        useful_missed_sleep = 0.0
    # Missed sleep as a fraction of what was usefully possible, scaled to 0-10.
    return 10.0 * useful_missed_sleep / useful_max

This “sleep score” correlates less strongly with “Max possible (hrs)” than “missed sleep” did (0.104 vs 0.198). That seems like a step in the right direction. I’m uncertain about whether I should tweak it until it doesn’t correlate with “Max possible (hrs)” at all.
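As a rough sketch of how that comparison can be run, the snippet below applies the scaling function to a handful of made-up nights and correlates both measures with “Max possible (hrs)”. The values of target_sleep and hours_noise_threshold are assumptions chosen purely for illustration, as are the rows and the “Sleep score” column name, so the output will not match the figures quoted above.

```python
import pandas as pd

# Assumed values for illustration only.
target_sleep = 8.0            # hours beyond this are not counted
hours_noise_threshold = 0.5   # shortfalls this small are treated as noise


def missed_sleep_scaled(row):
    useful_max = min(target_sleep, row["Max possible (hrs)"])
    if useful_max == 0:
        return -1  # invalid: no sleep was possible
    missed = useful_max - min(row["Hours that night"], target_sleep)
    if missed <= hours_noise_threshold:
        missed = 0.0
    return 10.0 * missed / useful_max


# Made-up nights, not real data.
df = pd.DataFrame({
    "Max possible (hrs)": [3.0, 9.0, 7.0, 8.0, 6.0],
    "Hours that night": [3.0, 8.0, 6.0, 5.0, 5.5],
})

df["Missed sleep"] = df["Max possible (hrs)"] - df["Hours that night"]
df["Sleep score"] = df.apply(missed_sleep_scaled, axis=1)

# Pearson correlation of each measure with the hours available.
print(df[["Missed sleep", "Sleep score"]].corrwith(df["Max possible (hrs)"]))
```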

Some sleep correlation data

You may have read in my previous post that I’m trying to use data to work out why I’m sometimes not sleeping well and how I might sleep better. I’ve been doing that now for some 86 days and I’m excited enough to look at the data and see if anything interesting has shown up. Ideally I’d like a year’s worth of data to get reliable results, but I’m impatient.

You may be wondering why the title of today’s post is so undramatic, prosaic even. Well, I’m rather a newbie when it comes to data science and I don’t want to leap to conclusions from the first thing I try. As you’ll see from my GitHub project, all I’ve done so far is to read in the data and use Python Pandas to produce the correlation results. I then pasted this into a spreadsheet, sorted and highlighted some rows.
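For anyone curious, the read-and-correlate step might look roughly like this. It is a simplified sketch rather than the exact code in the project: the tiny inline CSV is made up and stands in for the real data file.

```python
import io

import pandas as pd

# A tiny made-up CSV standing in for the real data file.
csv_text = """Hours that night,Max possible (hrs),Exercise (1-5),Evening meal size (0-5)
7.0,8.0,2,3
6.5,7.5,3,4
5.0,6.0,4,5
7.5,9.0,1,2
6.0,8.0,5,3
"""

df = pd.read_csv(io.StringIO(csv_text))

# Pearson correlation of every column with the target, sorted low to high.
correlations = (
    df.corr()["Hours that night"]
    .drop("Hours that night")
    .sort_values()
)
print(correlations)
```

Sorting the result puts the most negative influences at the top and the most positive at the bottom, which is the same ordering I did by hand in the spreadsheet.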

I used to think that correlation implied causation. Then I took a stats class. Now I don't. Sounds like the class helped. Well, maybe.

I’m also wary that correlation does not imply causation. But it does make for an interesting start.

With those caveats out of the way, this is what I’ve got so far.

Screenshot of spreadsheet showing potential influences on the "hours that night" variable.

Plain correlation from the first 86 days of data

The factor I’m hoping to maximise is “Hours that night” – how many hours I sleep on a night after all those potential influences have been measured. So I’m interested in things which might be positive or negative influences on that.

The top two I’ve put in grey, as I think they’re not very interesting, except to show that the correlation function seems to be working as expected.

  • “Max possible” is low when I have to get up very early, say to travel somewhere, so it’s always going to limit my sleep.
  • “Av hrs past 5 days” is a rolling average of “Hours that night” over the last 5 days. That I’m more likely to sleep if I’ve built up a huge sleep debt recently is unsurprising, but also confirms that the model seems reliable.
  • ZMA and FOS are two supplements I’ve been taking recently which are said to help with sleep, the ZMA particularly for those doing a lot of exercise. Evidence is limited and I’m not keen on trying every eccentric treatment “because you never know”, but they’re cheap and the side-effects are trivial. However, I’ve only been taking these for a couple of weeks, so I don’t think there’s enough data to say if they have helped me.
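A rolling average like “Av hrs past 5 days” can be sketched in Pandas as below. The numbers are made up, and whether the average should exclude the current night is my assumption here; I’ve shifted the series by one so each night is only compared with the nights before it.

```python
import pandas as pd

# Made-up hours of sleep for eight consecutive nights.
hours = pd.Series(
    [7.0, 6.0, 5.5, 8.0, 7.5, 6.5, 7.0, 5.0],
    name="Hours that night",
)

# Mean of the five nights *before* each night: shift(1) excludes the
# current night, so the average never contains the value it is compared with.
av_past_5 = hours.shift(1).rolling(window=5).mean()

print(pd.DataFrame({"Hours that night": hours, "Av hrs past 5 days": av_past_5}))
```

The first five nights come out as NaN because there aren’t yet five earlier nights to average.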

Eating

If I had guessed, I would’ve expected “Evening meal finish” (the time at which I finish dinner) to have had the greatest negative effect on my sleep, as I often wake early feeling bloated if I’ve eaten late. It does seem to be a negative factor, along with “Evening meal size (0-5)”. I’ll aim to eat earlier and keep recording results.

Alcohol

This wasn’t a significant factor for me. Alcohol is supposed to make you fall asleep later but wake up too early, losing sleep overall. Anecdotally, I have found this to be true for me. However, I drink quite rarely and haven’t had more than four units a day in any of the last 86 days, so my stats so far may not say much about that.

Daylight, Sugar, Screen time, Fasting

I’m recording these as they’ve either been blamed for bad sleep or hailed as a helpful thing. They don’t seem to be a big deal for me. I may consider stopping recording them so I have more time/space for other data.

Worry, Excitement

It’s interesting that these have some negative effect on my sleep, but as they’re all-day values, there’s probably not a huge amount I can do to control them.

Exercise

I am surprised and, if I’m honest, disappointed to see “Exercise (1-5)” as such an apparently bad influence on my sleep. Studies have suggested that exercise should have a positive effect on sleep, but that may depend on intensity.

For me, an exercise score of 1 indicates a day where I didn’t walk for more than 15 mins and did no other exercise; 2 is a normal day where I cycle to/from the station, about ten minutes each way; 3 is a bike ride of up to 3 hours or a 20-min weights/callisthenics session; 4 is a 3-6 hour bike ride; and 5 is reserved for the all-day and sometimes all-night rides I occasionally do.

Perhaps this isn’t enough data on what, for me, may be an important question. Questions I’d like to answer might include:

  • Is morning exercise better or worse for sleep than evening exercise?
  • Is moderate exercise better for sleep than either extreme?
  • Does taking ZMA (or something else) mitigate the apparently negative effects of exercise on sleep?
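The “moderate versus extreme” question is one a single correlation figure can’t answer, since it only measures a straight-line relationship. One possible approach, sketched here on made-up numbers, is to average the sleep for each exercise score and look at the shape.

```python
import pandas as pd

# Made-up nights, not real data.
df = pd.DataFrame({
    "Exercise (1-5)": [1, 2, 2, 3, 3, 4, 5, 2],
    "Hours that night": [6.5, 7.0, 7.5, 7.0, 6.0, 6.0, 4.5, 6.5],
})

# Mean sleep and number of nights for each exercise score. If the middle
# scores did best, it would show up here even when the overall
# correlation is close to zero.
by_exercise = df.groupby("Exercise (1-5)")["Hours that night"].agg(["mean", "count"])
print(by_exercise)
```

The count column matters as much as the mean: with only a night or two at scores 4 and 5, those averages wouldn’t be worth much.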

Finally, instead of simply “Hours that night” should I be measuring the sleep I got as a fraction of the “Max possible” sleep? That might account for strange circumstances where I was still cycling at 1am and inevitably scored a 5 for “Exercise”.
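As a sketch of that idea, the fraction could be calculated like this. The rows are made up and “Fraction slept” is a hypothetical column name, not one in my spreadsheet.

```python
import pandas as pd

# Made-up nights, including one where no sleep was possible at all.
df = pd.DataFrame({
    "Max possible (hrs)": [3.0, 9.0, 0.0],
    "Hours that night": [3.0, 8.0, 0.0],
})

# Swap a zero maximum for NaN so the division gives NaN rather than
# a meaningless result for nights where no sleep was possible.
max_possible = df["Max possible (hrs)"].replace(0, float("nan"))
df["Fraction slept"] = df["Hours that night"] / max_possible

print(df)
```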

Conclusions

I shouldn’t be drawing any firm conclusions yet, I think. 86 days is not that much data and there are many confounding factors that could be influencing things. I have a lot of thinking, learning and tweaking to do.

I plan to keep recording the data, expand my exercise data to include AM/PM and separate short intense efforts from longer endurance ones.