Fact-checking ChatGPT health advice: How to build your own health AI without leaking data

While OpenAI tries to convince us that sharing our medical records with them is safe, many people have already started looking for ways to build their own health AI. The idea is simple: pull data from watches and rings, run it through a local LLM, and get insights without leaking anything. This post continues the discussion from the previous one.

What data do you need — raw or pre-processed?

At first, getting the raw sensor signal (photoplethysmography, accelerometer) sounds appealing. In reality, you get a noisy voltage stream, and turning it into heart rate requires complex algorithms that filter out motion artifacts.
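To get a feel for what turning a noisy stream into heart rate involves, here is a minimal sketch on a synthetic signal. Everything in it is my assumption for illustration: the signal generator, the 1-second detrending window, and the autocorrelation approach are not what any particular watch runs, and real motion artifacts are much nastier than the Gaussian noise used here.

```python
import math
import random


def moving_average(x, window):
    """Centered moving average; the window shrinks near the edges."""
    half = window // 2
    out = []
    for i in range(len(x)):
        lo, hi = max(0, i - half), min(len(x), i + half + 1)
        out.append(sum(x[lo:hi]) / (hi - lo))
    return out


def estimate_bpm(signal, fs, min_bpm=40, max_bpm=180):
    """Estimate pulse rate from the strongest autocorrelation lag."""
    # Remove slow baseline drift by subtracting a ~1 s moving average.
    baseline = moving_average(signal, int(fs))
    x = [s - b for s, b in zip(signal, baseline)]
    # Only search lags that correspond to plausible heart rates.
    min_lag = int(fs * 60 / max_bpm)
    max_lag = int(fs * 60 / min_bpm)
    best_lag, best_score = min_lag, float("-inf")
    for lag in range(min_lag, max_lag + 1):
        score = sum(x[i] * x[i + lag] for i in range(len(x) - lag))
        if score > best_score:
            best_lag, best_score = lag, score
    return 60.0 * fs / best_lag


def synthetic_ppg(bpm=72, fs=50, seconds=20, seed=0):
    """Hypothetical test signal: pulse wave + slow drift + Gaussian noise."""
    rng = random.Random(seed)
    f = bpm / 60.0
    return [
        math.sin(2 * math.pi * f * t / fs)            # pulse component
        + 0.5 * math.sin(2 * math.pi * 0.1 * t / fs)  # baseline drift
        + rng.gauss(0, 0.3)                           # sensor noise
        for t in range(fs * seconds)
    ]


if __name__ == "__main__":
    print(round(estimate_bpm(synthetic_ppg(bpm=72), fs=50), 1))
```

Even this toy version needs detrending, a plausible-rate search window, and a resolution limited by the sampling rate, which is a good argument for letting the device do the work.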

For an AI assistant, we don't need the raw signal. We need the aggregated metrics the device has already computed: HRV, Sleep Score, Readiness.

Not everyone agrees. A recent Nature article argues that the ideal is to feed raw data to a neural network and let it find the patterns itself.

Where to get the data?

Oura Ring API: Data can be pulled directly. The community confirms it's the most developer-friendly API. Clean JSON with sleep phases and activity. Python libraries are available.

Fitbit & Google: The situation is rough. Google is slowly killing the old Fitbit ecosystem. They shut down the web dashboard, and the API is being migrated. Too unstable for a pet project.

Aggregators (Thryve, Terra, Vital): A "universal adapter" for multiple devices. Connect all your devices (500+ supported) and get unified JSON. The free tier is limited (up to 50 users), but ideal for personal use.
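For the Oura route, pulling daily data could look roughly like the sketch below. Treat the details as assumptions to check against Oura's official docs: the v2 base URL, the `daily_sleep` collection name, the `start_date`/`end_date` query parameters, the `data` key in the response, and authentication with a personal access token sent as a Bearer header. The function names are mine.

```python
import json
import urllib.parse
import urllib.request

# Assumed Oura API v2 base URL; verify against the official documentation.
OURA_BASE = "https://api.ouraring.com/v2/usercollection"


def build_request(token, collection, start_date, end_date):
    """Build an authenticated GET request for one collection and date range."""
    query = urllib.parse.urlencode(
        {"start_date": start_date, "end_date": end_date}
    )
    url = f"{OURA_BASE}/{collection}?{query}"
    return urllib.request.Request(
        url, headers={"Authorization": f"Bearer {token}"}
    )


def fetch_daily(token, collection, start_date, end_date):
    """Return the list of daily records (assumed to live under 'data')."""
    req = build_request(token, collection, start_date, end_date)
    with urllib.request.urlopen(req, timeout=30) as resp:
        payload = json.load(resp)
    return payload.get("data", [])


if __name__ == "__main__":
    import os

    # The token is read from an environment variable so it never lands in code.
    records = fetch_daily(
        os.environ["OURA_TOKEN"], "daily_sleep", "2024-01-01", "2024-01-07"
    )
    for record in records:
        print(record.get("day"), record.get("score"))
```

Only the standard library is used here, so nothing but your own machine and Oura's server ever sees the token.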

Bottom line

Take Oura (or an aggregator like Thryve on the free tier), pull cleaned trends (HRV dropped, heart rate spiked), and feed them to a local Llama. A model that sees your dynamics over time gives better advice than a clinic doctor who sees a single snapshot, and your data stays with you.
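Here is a rough sketch of that last step: turning daily numbers into plain-language trends and handing them to a local model. The 3-day window and 10% threshold are arbitrary illustrative values, and the model call assumes you run Llama through Ollama on its default local port with a model named `llama3`; swap in whatever local runner you actually use.

```python
import json
import urllib.request
from statistics import mean


def describe_trend(name, values, recent_days=3, threshold=0.10):
    """Compare the mean of the last few days with the earlier baseline."""
    if len(values) <= recent_days:
        return None  # not enough history to form a baseline
    baseline = mean(values[:-recent_days])
    recent = mean(values[-recent_days:])
    if baseline == 0:
        return None
    change = (recent - baseline) / baseline
    if abs(change) < threshold:
        return None  # within normal day-to-day wobble
    direction = "dropped" if change < 0 else "rose"
    return (
        f"{name} {direction} {abs(change):.0%} vs. baseline "
        f"({baseline:.0f} -> {recent:.0f})"
    )


def build_prompt(metrics):
    """Turn {metric name: daily values} into a short prompt for the model."""
    trends = []
    for name, values in metrics.items():
        trend = describe_trend(name, values)
        if trend:
            trends.append(trend)
    if not trends:
        trends.append("No metric moved more than the threshold.")
    header = "Here are my wearable trends for the past days:\n"
    footer = "\nWhat might explain them, and what should I watch for?"
    return header + "\n".join(f"- {t}" for t in trends) + footer


def ask_local_model(prompt, model="llama3",
                    url="http://localhost:11434/api/generate"):
    """Send the prompt to a local Ollama server (assumed setup)."""
    body = json.dumps(
        {"model": model, "prompt": prompt, "stream": False}
    ).encode("utf-8")
    req = urllib.request.Request(
        url, data=body, headers={"Content-Type": "application/json"}
    )
    with urllib.request.urlopen(req, timeout=120) as resp:
        return json.load(resp)["response"]


if __name__ == "__main__":
    prompt = build_prompt({"HRV (ms)": [60, 62, 61, 59, 60, 48, 47, 45]})
    print(prompt)
    # print(ask_local_model(prompt))  # uncomment once a local model is running
```

The model only ever receives a few summary lines, and the request goes to localhost, so nothing leaves the machine.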

UPD: we were wrong to fear raw data

A recent Nature article says the problem with wearables isn't sensor accuracy but the formulas: they average out the readings and give you something approximate. A computed HRV value, for example, gives you a baseline with no detail.
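A small illustration of how a summary formula throws detail away. RMSSD (root mean square of successive differences between beats) is one common HRV summary; whether your device reports exactly this is an assumption on my part, and the two beat-interval series are made up.

```python
import math


def rmssd(rr_ms):
    """Root mean square of successive differences between RR intervals (ms)."""
    diffs = [b - a for a, b in zip(rr_ms, rr_ms[1:])]
    if not diffs:
        raise ValueError("need at least two RR intervals")
    return math.sqrt(sum(d * d for d in diffs) / len(diffs))


# Two made-up series with very different dynamics:
oscillating = [800, 820, 800, 820, 800]  # beat-to-beat back and forth
drifting = [800, 820, 840, 860, 880]     # heart steadily slowing down

if __name__ == "__main__":
    print(rmssd(oscillating), rmssd(drifting))
```

Both series have successive differences of 20 ms in absolute value, so they collapse to the same RMSSD even though one is oscillating and the other is a steady drift. That lost shape is exactly the kind of detail the raw signal still contains.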

Turns out, if you feed a neural network that very "noisy" raw signal (the one we wanted to filter out), it finds unique personal patterns invisible on standard cleaned-up charts.

Analyzing this noise by hand is still painful (I was right about that), but you absolutely must stockpile raw archives. It's a golden dataset for future Health models that will compare you not to the hospital average, but to your past self.
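Stockpiling can be as boring as appending samples to compressed files, one per device per day. A minimal sketch follows; the directory layout, the two-column (timestamp, value) format, and the function names are hypothetical choices, not a standard.

```python
import csv
import gzip
from pathlib import Path


def archive_samples(root, device, day, samples):
    """Append (timestamp, value) rows to <root>/<device>/<day>.csv.gz."""
    path = Path(root) / device / f"{day}.csv.gz"
    path.parent.mkdir(parents=True, exist_ok=True)
    # Append mode adds a new gzip member; readers see one continuous stream.
    with gzip.open(path, "at", newline="") as fh:
        csv.writer(fh).writerows(samples)
    return path


def read_archive(path):
    """Read the archive back as a list of (timestamp, float value) tuples."""
    with gzip.open(path, "rt", newline="") as fh:
        return [(ts, float(value)) for ts, value in csv.reader(fh)]


if __name__ == "__main__":
    p = archive_samples(
        "health_archive", "ring", "2024-01-01",
        [("2024-01-01T00:00:00.00", 0.512), ("2024-01-01T00:00:00.02", 0.498)],
    )
    print(p, len(read_archive(p)))
```

Plain gzip-compressed CSV is deliberately unglamorous: it will still be readable by whatever future model you want to point at it.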