Saturday, January 18, 2014

Insight from Fitbit?

So, I decided that I was going to do something with my fitbit data.  I had to learn something from looking at close to a year's worth of data, right?  I hope.  My first thought came from my scale: it seemed that my body fat reading routinely increased when my weight was going down.  This didn't make intuitive sense, and it might not actually be true.  Time to test it with data!

My first attempt to do something useful was in Excel.  I quickly grew tired of the stupidity of scatter plots in Excel, because this is the plot I generated:


Given that I had some issues with my labeling, I will be clear: the x-axis is weight (in pounds) and the y-axis is body fat (in %).  The black lines connect data points in time sequence.  You will note that there seem to be some clusters of data where the points are just a little bit too nicely spaced from one to the next, almost forming a solid line of light blue boxes.  This is due to fitbit cheating.

My biggest complaint so far is that fitbit is inconsistent about representing missing data points.  For blood pressure, they show up as "0/0".  For weight, they come as interpolated data points.  I basically have to algorithmically filter them out or I'm not going to have a very believable relationship between weight and fat.  I gave up on doing that in Excel after trying some conditional statements and decided that R would be a better tool.
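
That filtering can be sketched in a few lines of Python (not the Excel conditionals, and not the R used later).  The sketch assumes daily readings and that an interpolated weight sits on the straight line between its two neighbors, which is only a guess at what fitbit does.  `flag_interpolated` and the `tol` value are made up for illustration, and a genuine reading that happens to fall on that line would be flagged too.  The "0/0" blood pressure rows need nothing more than a string comparison.

```python
def flag_interpolated(weights, tol=0.05):
    """Return one boolean per reading, True where it looks interpolated.

    Heuristic (an assumption, not fitbit's documented behavior): a point is
    suspicious when it equals the average of its two neighbors to within
    `tol` pounds.  The first and last readings are never flagged.
    """
    flags = [False] * len(weights)
    for i in range(1, len(weights) - 1):
        midpoint = (weights[i - 1] + weights[i + 1]) / 2
        flags[i] = abs(weights[i] - midpoint) <= tol
    return flags


def keep_measured(weights, tol=0.05):
    """Drop the readings that flag_interpolated marks as suspicious."""
    flags = flag_interpolated(weights, tol)
    return [w for w, suspicious in zip(weights, flags) if not suspicious]
```

Loosening `tol` catches interpolated values that were rounded in the export, at the cost of throwing away more real readings.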

To be able to use R, I was going to have to export the data I wanted to a csv.  I decided to find an online R platform and did find one that uses my favorite plotting package (ggplot2).  I am a big fan of R-serve, which I use at work all of the time, but didn't find any servers online that let people mess around on it.

Unfortunately, it didn't like the forward slashes in the excel csv export.  I foolishly decided to use TextWrangler's grep'ing capability to find and replace the dates.  I say foolishly, because I thought that I remembered how to use it, but didn't really.  I had to refer to some find and replace strings I had pulled together for work.  Ultimately, the date find took the form of:
(?P<month>\d)/(?P<day>\d)/(?P<year>\d),
and the date replace looked like:
20\P<year>-0\P<month>-0\P<day>,
I got there, anyways.  It then did make it quick to replace the slash in the blood pressure column.  I wanted to replace "0/0" with "0,0".  And I replaced the "Blood Pressure" with "Systolic" and "Diastolic".  This was a simpler find string, that I made too complicated, perhaps:
,(?P<systolic>\d+)/(?P<diastolic>\d+),
and the replacement:
,\P<systolic>,\P<diastolic>,
The only problem was that I had 352 rows of data and it made only 351 replacements.  I hate searching out the one problem child.  One trick is to throw the csv into Excel and look for column shifting (I've created a new column with this replacement).  It should be pretty quick given that Excel will generally automatically recognize the comma as the column separator.  And if it doesn't - "Text to Columns . . ." works in a jiffy.  It turns out that one row has a non-integer reading for the diastolic, so it didn't match my grep.  Rather than fix the pattern, I made an ad hoc change to the csv.
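
For comparison, here is a rough Python sketch of the same two clean-up steps.  It is not the TextWrangler search above: Python writes named backreferences as `\g<name>` rather than `\P<name>`, and the patterns here are loosened on purpose to accept one- or two-digit months and days and decimal blood pressure readings.  It assumes the date is the first column with a two-digit year and that the blood pressure column has a comma on each side; the row layout in the usage note is hypothetical.

```python
import re

# Date in the first column, e.g. "1/8/14," (two-digit year is an assumption).
DATE_RE = re.compile(r"^(?P<month>\d{1,2})/(?P<day>\d{1,2})/(?P<year>\d{2}),")
# Blood pressure such as ",120/80," or ",118/79.5," (decimals allowed).
BP_RE = re.compile(r",(?P<systolic>\d+(?:\.\d+)?)/(?P<diastolic>\d+(?:\.\d+)?),")


def iso_date(match):
    """Turn M/D/YY into 20YY-MM-DD, zero-padding month and day as needed."""
    return "20{}-{:02d}-{:02d},".format(
        match.group("year"), int(match.group("month")), int(match.group("day"))
    )


def clean_row(row):
    """Rewrite the date and split blood pressure into two columns.

    Returns (cleaned_row, bp_matched) so that rows the blood pressure
    pattern misses can be listed instead of hunted for by hand.
    """
    row = DATE_RE.sub(iso_date, row, count=1)
    row, replaced = BP_RE.subn(r",\g<systolic>,\g<diastolic>,", row, count=1)
    return row, replaced == 1
```

Calling `clean_row("12/25/13,170.2,118/79.5,8000")` gives back the rewritten row plus `True`; collecting the rows that come back with `False` is a quick way to find the one problem child.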

And . . . it wouldn't upload.  So, back to the desktop version of R.  So here is my best effort so far (using ggplot2):
The plot also includes a linear regression of the data and the standard error band around the estimate.  Despite my initial theory, it looks like my body fat readings are reasonably well correlated with my weight, but only reasonably well.  It's a 20% adjusted R², but the p-value for the model is 4.1e-8.

The code to generate this is right here:
codehere::codehere
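
As a stand-in, here is a minimal Python sketch of the calculation behind a fit line like this one, assuming an ordinary least-squares regression of body fat on weight.  It is not the ggplot2 code that produced the plot, and `simple_ols` is a name chosen for illustration.  It returns the slope, the intercept, R² and the adjusted R² for a single predictor.

```python
def simple_ols(x, y):
    """Least-squares line y = intercept + slope * x, plus R² and adjusted R²."""
    n = len(x)
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    sxx = sum((xi - mean_x) ** 2 for xi in x)
    sxy = sum((xi - mean_x) * (yi - mean_y) for xi, yi in zip(x, y))
    slope = sxy / sxx
    intercept = mean_y - slope * mean_x
    # Residual and total sums of squares.
    ss_res = sum((yi - (intercept + slope * xi)) ** 2 for xi, yi in zip(x, y))
    ss_tot = sum((yi - mean_y) ** 2 for yi in y)
    r2 = 1 - ss_res / ss_tot
    # One predictor, so the adjustment divides by n - 2.
    adj_r2 = 1 - (1 - r2) * (n - 1) / (n - 2)
    return slope, intercept, r2, adj_r2
```

Pass the weight column as `x` and the body fat column as `y`; it needs at least three points and some spread in both columns.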
It occurred to me that I should really first be interested in how my weight evolved over time.  This is what fitbit.com provides:
I think it hides too much of the variability.  Here's my plot:

The code to generate this plot is simpler:
ss
Clearly, my goal weight is 160 pounds. I'm a bit closer than the last data point here, down to 163.5 pounds.  But what is driving my weight gain & loss?  Is it potentially related to my activity level?  Let's see.

Let's at least start with what my calorie burn looks like.  First, just day by day (are there any trends) and then by day of week.  Here we go:
and the daily average (violin plots with the mean represented as a green dot):
By far, it looks like Saturday is my busiest day, in terms of activity.  But the distribution of Saturday overlaps the other days of the week from lazy to super active.  I guess I have my slow weekend days, as well.  Another view is very derivative and slightly easier to create - that is a monthly view:
I really don't know why January looks so high (this is only the tail end of January 2013), perhaps it's just the small number of data points.  It does look like after summer I slowed down a bit and picked it up a bit through November, getting lazier in December again.

And the code to generate the above:
codehere::codehere
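
The day-of-week and monthly views both come down to grouping daily calorie burn by a key derived from the date.  A minimal Python sketch of that grouping follows (not the R code behind the plots).  It assumes the data is a list of `(date, calories)` pairs; `mean_by`, `by_weekday` and `by_month` are illustrative names.

```python
from collections import defaultdict
from datetime import date

WEEKDAYS = ["Monday", "Tuesday", "Wednesday", "Thursday",
            "Friday", "Saturday", "Sunday"]


def mean_by(rows, key):
    """Average calories per group, where `key` maps a date to its group."""
    totals = defaultdict(lambda: [0.0, 0])  # group -> [sum, count]
    for day, calories in rows:
        bucket = totals[key(day)]
        bucket[0] += calories
        bucket[1] += 1
    return {group: total / count for group, (total, count) in totals.items()}


def by_weekday(rows):
    """Mean calories for each day of the week (date.weekday(): Monday is 0)."""
    return mean_by(rows, lambda d: WEEKDAYS[d.weekday()])


def by_month(rows):
    """Mean calories for each (year, month) pair."""
    return mean_by(rows, lambda d: (d.year, d.month))
```

The means hide the spread that the violin plots show, so they are only a starting point for spotting a busy day like Saturday.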

Not surprisingly, there is a strong relationship between steps and calories burned.  I think that it'd be reasonable to expect some dispersion due to other activities that I log (e.g., lifting, biking).  It looks like the outlier day of less than 5,000 steps was one where I rode 50 miles on my bike.  So I guess it could make sense.

This simple relationship has an adjusted R² of 49%, roughly meaning that about half of the variability in calorie burn is explained by steps.  What fitbit really does is a bit more complicated.  I believe that their calculations incorporate the amount of time that you are active - in various states.  The better model would look like the following:

This model (its construction shown in the code below) has a 90% adjusted R² and an F-stat for the model of over 825.  I think that indicates significance.  Interestingly, the time sedentary is not a (very) significant variable, but the model thinks that the intercept is.  Which probably makes sense, indicating that there is just a baseline number of calories that somebody of my (roughly constant) weight would burn.  The true model that they use is one of both body mass and activity.  They report Activity Calories burned and the above model against that has a 98% R², so I think that indicates a match.

codehere::codehere
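
A model with several predictors can be sketched without any libraries by solving the normal equations.  The Python below is an illustration of that idea, not the R model described above.  It assumes each row holds the minutes spent in each activity state for one day (which columns to use is left to you), and `fit_ols` and `adjusted_r2` are made-up names.  Predictors that are exact linear combinations of one another will make it fail with a division by zero.

```python
def fit_ols(rows, y):
    """Least-squares coefficients, intercept first, via the normal equations."""
    X = [[1.0] + list(r) for r in rows]  # prepend the intercept column
    k = len(X[0])
    # Augmented matrix [X'X | X'y].
    A = [[sum(x[i] * x[j] for x in X) for j in range(k)]
         + [sum(x[i] * yi for x, yi in zip(X, y))]
         for i in range(k)]
    # Gauss-Jordan elimination with partial pivoting.
    for col in range(k):
        pivot = max(range(col, k), key=lambda r: abs(A[r][col]))
        A[col], A[pivot] = A[pivot], A[col]
        for r in range(k):
            if r != col:
                factor = A[r][col] / A[col][col]
                A[r] = [a - factor * b for a, b in zip(A[r], A[col])]
    return [A[i][k] / A[i][i] for i in range(k)]


def adjusted_r2(rows, y, coefs):
    """Adjusted R² for the fitted coefficients (intercept first)."""
    n, p = len(y), len(coefs) - 1
    fitted = [coefs[0] + sum(c * v for c, v in zip(coefs[1:], r)) for r in rows]
    mean_y = sum(y) / n
    ss_res = sum((yi - fi) ** 2 for yi, fi in zip(y, fitted))
    ss_tot = sum((yi - mean_y) ** 2 for yi in y)
    return 1 - (ss_res / ss_tot) * (n - 1) / (n - p - 1)
```

The first coefficient is the intercept, which corresponds to the baseline calorie burn discussed above.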

Now let's get to a more interesting question/hypothesis.  My guess would be that the more active I am over a period of time, the more likely I am to lose weight.  Seem reasonable?  Let's check it out.  First we need to think about the variables available to us and whether they'd reasonably be expected to show a relationship.  I think that the answer is no, given that all of the observations are for an individual day (no trending).  Keep in mind that the best we are going to be able to do is capture the outflow or burning of calories.  I haven't tracked my food intake over any meaningful period of time.  And fitbit doesn't provide that in the dataset anyway.

So, I will have to do some transformations first, but I'll save that code until after the graph.  So let's look at a weekly time period (average activity calories in a week) versus the weight change over the course of that week in the form of weight(date) - weight(date, seven days later).  With that defined, I can look at the visuals:

At first glance, this doesn't seem super good.  It's mostly a cloud; don't let the fit line and the standard error band trick you.  If there was something there, it would be in the direction that I think that it should be.  That is to say, as my activity increases, my weight decreases.  The model's adjusted R² is only 3%.  This isn't to say that there couldn't be a different date range that we should be looking over, but I would be concerned that we are finding a spurious relationship.  Even if we did find something, I think that it would be worth doing an in-sample/out-of-sample test for significance.

Before I forget, here is the code:
codehere::codehere
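
The weekly transformation can be sketched in Python as follows (again, not the R code).  It assumes two dictionaries keyed by date, one for activity calories and one for weight, and it takes "a week" to mean the seven days starting on the date in question, which is one reading of the definition above.  `weekly_pairs` is an illustrative name.

```python
from datetime import date, timedelta


def weekly_pairs(activity, weight, days=7):
    """Pair mean activity calories with the weight change over the same span.

    Returns (start, mean_activity, weight[start] - weight[start + days])
    tuples, so a positive change means weight was lost.  Start dates with a
    missing weigh-in or a missing activity day are skipped.
    """
    pairs = []
    for start in sorted(weight):
        end = start + timedelta(days=days)
        window = [start + timedelta(days=i) for i in range(days)]
        if end not in weight or any(d not in activity for d in window):
            continue
        mean_activity = sum(activity[d] for d in window) / days
        pairs.append((start, mean_activity, weight[start] - weight[end]))
    return pairs
```

Changing `days` is an easy way to try a different date range, with the caveat about spurious relationships still applying.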

So I'm out of reasonable questions to answer with this data.  How about some random questions:

  • Do steps correlate with Floors climbed?  (i.e., when I'm active, am I active in both ways)
  • How long do I sleep when I track my sleep?  I have been super inconsistent in tracking it, even though I use my fitbit as an alarm clock on weekday mornings.  How does the fitbit data compare to the AskMeEvery data that I've been collecting for the last two weeks?
Let's tackle these.  For the first, here is the graphic:
With a 35% R², I think it's safe to say that there is some correlation, but it's really not definitive.  See the data point at about 17,000 steps - that would be one floor climbed that day.  It can happen.  I guess.  150 floors seems like a lot, but I also took ~17,000 steps that day.  So there is something there.

And now for sleep.  Here's what I've gotten from AskMeEvery:
So I sleep a lot on weekends.  I try to get 7.5 hours during the week; it happens, though 7 is much more likely.  How did the fitbit data compare?

Given the limited data from AskMeEvery, I think these are essentially equivalent.  What it does indicate, I think, is that on a night that I think that I get 7.5 hours, I'm really getting ~6.  The rest of the time is getting to sleep and restlessness during the night.

Finally, the code for the last two fitbit graphs:
codehere::codehere

Reflection on the data and FitBit

I have looked around on the web about what people think or have learned by using their FitBit.  It can be summed up as:

  • I didn't realize how sedentary I am/was
  • I walk more because I'm wearing a FitBit
  • I like getting badges
Aside from general behavior changing on the margin, I'm unimpressed with what I see out there.  I think my original review holds up fairly well.  I think the following two changes should be made (at a minimum):
  • Add the alarm alerting you that you've been sedentary too long.  Let the user choose this, but provide links or other guidance on the website about what might be a useful interval.  I addressed this in more detail in my initial review.
  • Add an option that goes beyond their "Step Goal Milestones", which alert you when you've hit "75%, 100% or 125% of your daily goal."  These are fine notifications, but what if you only get to 70% of your goal?  You never really know that.  I'd prefer time-based notifications that put your day into perspective, letting the user choose the time(s) of day at which to receive them (for me, 8am and noon would be most useful).  I want to be motivated beyond just a fixed goal, which I'm sure most users have never changed from the day they set up their FitBit.  The message I want is like the following:
    • You've taken X steps so far today.  This is at the Yth percentile of the last week and Zth percentile of the last month.  Only A steps until your goal!
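
A message like that needs little more than a percentile rank.  Here is a hypothetical Python sketch; nothing in it is a FitBit API.  It assumes `last_week` and `last_month` hold the step counts reached by the same time of day on earlier days, and the function names are invented for illustration.

```python
def percentile_rank(value, history):
    """Percent of values in `history` that are at or below `value`."""
    return 100.0 * sum(1 for h in history if h <= value) / len(history)


def ordinal(n):
    """Append the English ordinal suffix: 1st, 2nd, 3rd, 4th, 11th, ..."""
    if 10 <= n % 100 <= 20:
        return f"{n}th"
    return f"{n}" + {1: "st", 2: "nd", 3: "rd"}.get(n % 10, "th")


def progress_message(steps_now, goal, last_week, last_month):
    """Build the time-of-day notification text."""
    week = ordinal(round(percentile_rank(steps_now, last_week)))
    month = ordinal(round(percentile_rank(steps_now, last_month)))
    message = (f"You've taken {steps_now:,} steps so far today.  "
               f"This is at the {week} percentile of the last week "
               f"and {month} percentile of the last month.  ")
    remaining = goal - steps_now
    if remaining > 0:
        return message + f"Only {remaining:,} steps until your goal!"
    return message + "You've hit your goal!"
```

Both history lists must be non-empty, since an empty one would divide by zero.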
To keep its users engaged, I think FitBit really needs to do more.  I'm not super convinced the solution is my first anniversary email:
Maybe it'd be more interesting to tell me where I fit in the distribution of all FitBit wearers.  Answer questions like the following:
  • I've worn mine consistently over the last year.  Have I worn it more than 90% of their customer group?  Shouldn't that make me feel good?
  • For people my age (weight/sex/zip code), where am I in terms of activity over the last 12 months? Weight gain / loss?  Body fat gain / loss?
  • Talk some about the FitBit community - in aggregate, how many pounds lost, miles travelled, steps taken?
  • [I will add more as I think of them]
Other thoughts out there?

Wednesday, January 8, 2014

Protection!

I am renting out a room in my house and went downstairs to check on the smoke detector.  When I did, I found that there was no smoke detector.  I did some quick research on Amazon and found that I would probably spend about $40 for a combination smoke and carbon monoxide detector.  I thought for a bit and broke down and bought a nest Protect.  It came today.


While it was more complicated to install than your average everyday smoke detector / carbon monoxide detector, it wasn't very hard.  It does, however, require a mobile device that can connect to it via wifi.  Given that I already had a nest Thermostat (and a nest account), there was little to do during the setup process except name the device (choose the room) and enter my Wifi network's password.  The folks at nest thoughtfully included the first batch of six AAA batteries; I'm hopeful that they last for some meaningful amount of time.  If things go well with this one, I will consider adding more to other levels of the house.  Or, it might be better to wait until they add thermometers to communicate with my nests.

Four screws (provided) later and I was good to go.  Now, my phone will get an alert when the house is on fire!

Tuesday, January 7, 2014

Frozen Nest

I've just had a reasonably bad experience with my nest.  I guess I had recently agreed to let the nest manage our heat based on the auto-away setting.  When I got home last night from work, the house was crazy cold (60º upstairs), despite the fact that the boys were home all day.  They were home, but by and large just sitting in front of their computers, not doing much of anything involving motion that the nest could have picked up on as a sign that they were home.

The reason that I'm unhappy with this nest experience is two-fold:

  • The low temperature tonight is forecasted to be in the single digits and I'm concerned about pipes freezing.  I wanted to keep my house reasonably toasty to minimize the potential for burst pipes and expensive clean-up.
  • The nest was never able to bring the house up to temperature.  When I went to bed, the nest was up to 65º - not comfortable - and the furnace had been running non-stop.
What this points to is the following:
  • The auto-away feature should have an additional limit: this should be the amount of time that you are willing to wait for the house to get back up to the target temperature.  This should take into account your house's modeled behavior and the current and forecasted temperatures.  I should be able to say that I never want it to take more than 1 hour to get to temperature.  The nest would then ensure that it could get the house up to that temperature (and if somewhat extreme temperatures were forecasted, it would take that into account, dynamically setting the low temperature mark).
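
The proposed limit can be sketched as a small calculation.  Everything in this Python example is hypothetical: it is not how the nest works, and the furnace rate, heat-loss coefficient and floor are invented numbers standing in for the house's modeled behavior.  It assumes the house warms at a net rate equal to what the furnace adds minus a loss proportional to the indoor/outdoor difference.

```python
def min_away_temp(target, outdoor_forecast, max_recovery_hours=1.0,
                  furnace_rate=8.0, loss_per_degree=0.1, floor=50.0):
    """Lowest away temperature (ºF) from which `target` is reachable in time.

    furnace_rate:    ºF per hour the furnace adds with no heat loss (made up)
    loss_per_degree: ºF per hour lost per ºF of indoor/outdoor gap (made up)
    The loss is evaluated at the target temperature, where it is largest,
    so the estimate errs on the warm side.
    """
    net_rate = furnace_rate - loss_per_degree * (target - outdoor_forecast)
    if net_rate <= 0:
        # The furnace can barely hold the target; do not set back at all.
        return target
    return max(floor, target - net_rate * max_recovery_hours)
```

With these made-up numbers, a 68º target allows a setback to about 62.8º when it is 40º outside but only to about 66.3º when it is 5º outside, which is the dynamic low mark described above.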
Let's get on it nest!  Or other software providers, now that the API is supposedly available.

Saturday, January 4, 2014

Nest - Data-free Badness

I have had two Nests installed in my house for my upstairs and downstairs HVAC systems for some time now (since 2012-02-26), but I haven't written about them since my first post with my initial thoughts after installation.  I thought I'd share my current thoughts on the product.

What's Awesome (in no particular order):


  • Super easy installation.  Best in class instructions.  Covered more in my original post.
  • Ability to control my thermostat remotely via an iPhone app.  Cool stuff.  My kids (who live on the upper level) and I have a bit of a war with setting the temperature during the summer, but I always win.
  • Auto-Away: it learns when you are not around and automatically cuts back on your energy usage.
  • Monthly reports: Nest will email you a short summary of your energy usage and 
  • Beautiful design: it looks good and it's easy to use.

What's (still) Not Awesome

  • I still can't get my data.  It turns out that I could have if I was willing to go through some gymnastics (see Gregory Booma's blog post - he provides a script to import the data into R, one of my favorite tools).  Unfortunately, Gregory has updated this as of 2013-12-03 saying that due to Nest introducing their API, the functionality described in his post is not available any longer.  I'm inclined to sign up as a developer, but would rather have intermittent access to all of my historical data; I don't want to have to set up a server to capture my data.  This was a $250 thermostat.  I think that they could give me the ability to download a CSV every so often.
  • Given that I can't look at my data myself, the app and website still seem very underdeveloped.  You just can't look at much.  10 days of history is it.  That is pitiful.  See the graphic below.  It's crazy how much more my system ran last night (almost non-stop) when the temperature fell to 10ºF.  Sadly, that's a lot of propane.
    • But what was running?  Was it just the fan running (I have the fan on the 15 minutes an hour schedule for greater comfort), or was the furnace chugging away burning propane?  Really guys, you couldn't figure out a way to represent this on the same graph?
    • Why are graphics covered up to the point I have no idea what they are?
    • Why can't I see a sparkline of the temperature in the house?  The humidity?  This seems stupidly obvious that I'd want to be able to see the history here.
  • I can't connect the Nest to supplemental temperature sensors.  I'd love to have between six and 12 wifi (or Z-wave) sensors reporting to the Nest and be able to set up rules such as:
    • Run heating (cooling) if any of the sensors gets below (above) a defined set point.
    • Warning if the temperature gradient is higher than a defined level (e.g., 10ºF).  Help me troubleshoot by showing me the variation across all of the sensors over time.  I'd love to be able to tweak my registers in a way that limits variation and increases comfort.
    • Allow me to activate register boosters (my term): basically, registers with supplemental fans to increase airflow to particular rooms or sections of the house. 
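
The rules in that wish list are simple enough to sketch.  The Python below is purely hypothetical, since the Nest offers no such sensor support.  It assumes a dictionary of sensor names to temperatures in ºF; the set points and the choice to boost any room outside the comfort band are illustrative.

```python
def evaluate_sensors(readings, heat_setpoint, cool_setpoint, max_gradient=10.0):
    """Apply the three rules to one snapshot of sensor readings.

    readings: dict mapping a sensor (room) name to its temperature in ºF.
    """
    coldest = min(readings.values())
    warmest = max(readings.values())
    return {
        # Run heating (cooling) if any sensor is below (above) its set point.
        "heat": coldest < heat_setpoint,
        "cool": warmest > cool_setpoint,
        # Warn when the spread across the house exceeds the allowed gradient.
        "gradient_warning": (warmest - coldest) > max_gradient,
        # Rooms outside the comfort band are candidates for a register booster.
        "boost": sorted(name for name, temp in readings.items()
                        if temp < heat_setpoint or temp > cool_setpoint),
    }
```

Logging each snapshot over time would give the history needed to see which rooms drift and to tweak the registers.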


Friday, January 3, 2014

A Failure to Broadcast

I blogged before about doing some outdoor wifi so that I could use my outdoor stereo system.  While this setup worked in the testing phase, it turned out not to work after everything was installed.  It also had the disadvantage that I could no longer use my Aria wifi scale - setting up the Amped Wireless SR10000 required me to turn off the wifi band that the scale depended on.  After I had figured out the system wasn't working, I switched my Apple network back to including the missing band (I think it was g).  I've been bouncing around in my head how to solve this problem and just recently concluded that I should just use another Apple product.

To be clear, what I mean to set up is another AirPort Extreme (APE) extending my Apple-based wireless network to the outside.  The key, though, is the ability to get the external antenna hooked up to my APE.  This is where the research came into play.  And I will admit, it was slow going on my end.

I did find a website that sells modified APEs, but they were substantially marked up (to roughly $135). [I can't find the link at the moment, but when I do, I will link it]. This was way more than I wanted to spend, so I kept looking.  Eventually, I found this installation guide on MacWireless.com.  While the site also sells the connectors, I decided to hunt around for them elsewhere; I am much more interested in buying on a trusted site.  Amazon didn't let me down - I found just what I needed.  Free shipping, too.  That said, the shipping time is between 17 and 28 days.  No worries, though, I need this for summer, not for the present.

Right now, here is a view of my network configuration (with a big thank you to Gliffy - of which I'm a big fan), click through to get the full-sized version:

I'm thinking that all I will need to do is find a 10/100 Mbps repeater and replace the APE with that and then move my APE into the box hanging off of my eave.  The APE then would be directly connected to the large, externally mounted antenna.  From there, I'd up the power level to the max and see how my wifi coverage looks throughout the house and outside.  If I need the additional interior coverage, I will purchase another APE and place it where my existing one is.  I'm debating whether or not I really need a Gigabit switch, or if 10/100 Mbps will be sufficient (and I think I have one lying around).  I think that will be the first attempt and then if it doesn't work, I'll look at purchasing a $40 5 Port Gigabit switch.

A minor change to the configuration represented below (and a more complete representation of the network, including wireless devices and labeled cable runs):

I think that's a pretty full network - 50 devices directly connected.  And I'm guessing that I don't have the most devices of the folks out there.

Wednesday, January 1, 2014

Tracking Your Critters

It has been said that over half of the genetic material in or on our bodies is not human in origin.  One area where there has been a lot of study in the last five years has been on the flora in our guts.  The bacteria in our guts have been linked to various problems - ulcers, obesity, stupidity (ok, maybe not the last one).  There is now a start-up out there looking to help you understand your very own flora.  They call themselves uBiome.  I think that they should be called uPoop and then you can have an iPhone app called (obviously) iPoop where they deliver the results of your most recent stool sample.

Joking aside, this looks super interesting to me and when the prices drop to something less than $89 per sample, I might dive in (so to speak).  I do have some concern that the FDA will shut them down in a way similar to 23andMe, so I'll wait until that gets sorted out, too.

Your Data is Yours for a Price

I've recently blogged about how I think personal data should be owned by those generating it.  Related, I was just thinking about how long I've owned my fitbit - almost a year - and that I was going to have to write a post summarizing my thoughts on living with a fitbit for a year.  As part of that, I was ready to bitch and moan that they do not provide their users with access to their own underlying information.  To make sure this was a true statement, I did some quick Googling.

What I found was not very satisfying on a few different levels:

  1. Access to your own data is only available if you are a premium subscriber - that is, you pay an additional $49.99/year (after you've already bought the device).
  2. Even when starting a seven day free trial period to download the data, it is only available with nothing greater than daily granularity.  A bit of poking around on the Google Group for fitbit developers indicates that only a "select few" developers will have access to sub-daily data.
  3. The data is presented poorly - there are three separate sections in the CSV: body, activities, sleep.  All are keyed on date.  Why not one section with more columns?
  4. The data has odd inconsistencies:
    1. If you request dates for which you hadn't used your fitbit, you will get information that isn't real - i.e., no minutes of activity rather than null values
    2. For data where you have limited entries - weight, blood pressure, resting heart rate, etc., fitbit does different things: weight is repeated, blood pressure is 0/0 instead of NA, resting heart rate is 0 instead of NA.
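
The first clean-up steps for an export like this can be sketched in Python.  This is a guess at a workable approach, not anything fitbit provides.  It assumes each section has already been read into a list of dictionaries sharing a "Date" key, and the column names in `SENTINELS` are assumptions about the export's headers.  The repeated weights are not handled here.

```python
# Placeholder values that stand in for "no reading" (assumed column names).
SENTINELS = {"Blood Pressure": "0/0", "Resting Heart Rate": "0"}


def merge_sections(*sections):
    """Combine the body, activities and sleep sections into one row per date.

    Rows come back sorted by the date string, which is only chronological
    if the dates are already in ISO (YYYY-MM-DD) form.
    """
    merged = {}
    for section in sections:
        for row in section:
            merged.setdefault(row["Date"], {}).update(row)
    return [merged[day] for day in sorted(merged)]


def null_sentinels(row):
    """Replace placeholder values with None so they read as missing."""
    return {col: None if SENTINELS.get(col) == val else val
            for col, val in row.items()}
```

Running `null_sentinels` over the output of `merge_sections` gives one wide table with real gaps where there were no readings.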
I'm still down on the whole environment that is out there now.  That said, I signed up for a MeetUp group (for DC) that is associated with the Quantified Self.

After I've had a chance to go through the data, I will share my insights (if any).