
Data Sources

All the data Garmin spews

TL;DR - Data Sources

What Data is Available: Garmin devices provide comprehensive data, including activity tracking (e.g., running, cycling), health stats (e.g., heart rate, sleep stages), performance metrics (e.g., VO2 Max, training load), and environmental sensors (e.g., weather, temperature).

Data Sources for Research: Official data channels lack the granularity required for in-depth analysis. Instead, GarminDB and python-garminconnect are utilized for their detailed and comprehensive data coverage. Both require advanced Python/scripting expertise to access and store the data for further analysis.

GarminDB Data Exploration: GarminDB is the primary source for historical data, including daily summaries, activity summaries, and sleep records.

Python-GarminConnect: This package is used to retrieve Heart Rate Variability (HRV) data, unavailable in GarminDB, along with in-sport activity details, enabling deeper insights and analysis.

Data Details

Let's look at all the metrics that the device tracks, records, and calculates.

Is this too much data?
No, it is never too much data. I will take any data point I can access that helps me make better decisions. The critical part is turning the data into actionable insights.

Data sources for research

When researching available resources online, we can broadly categorize data sources into two main types:

  1. Official Sources

  2. Unofficial Sources
     

Each of these sources can be evaluated based on the following criteria:

  • Ease of Access: How readily the data can be obtained.

  • Completeness: The extent to which the data fulfills our requirements.
     

By assessing these parameters, we can effectively narrow down and identify the most suitable data sources for our needs.


To summarize, official channels lack the granularity required for in-depth analysis, while the unofficial tools offer detailed coverage but require Python scripting to access and store the data.

Based on the data requirements and my preference for comprehensive information, I have decided to use GarminDB and python-garminconnect as my primary data sources. These data sources provide the depth and breadth of data necessary for my analysis.

GarminDB data exploration

Installation


Copy GarminConnectConfig.json.example from the installed folder to ~/.GarminDb/GarminConnectConfig.json. Edit it to add your Garmin Connect username and password, and adjust the start dates to match the dates of your data in Garmin Connect. For me, this went all the way back to July 2016.
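The config edit can also be scripted. The sketch below is only an illustration: the key names (`credentials`, `user`, `password`, `data`, and fields ending in `_start_date`) and the MM/DD/YYYY date format are assumptions about the example file, and `update_config` and `rewrite_config_file` are hypothetical helpers, so check them against your own copy of GarminConnectConfig.json.example before relying on this.

```python
import copy
import json
from pathlib import Path

# Location taken from the installation step; adjust if yours differs.
CONFIG_PATH = Path.home() / ".GarminDb" / "GarminConnectConfig.json"


def update_config(config, user, password, start_date):
    """Return a copy of config with credentials and every *_start_date set.

    The key names used here are assumptions about the example file.
    """
    updated = copy.deepcopy(config)
    creds = updated.setdefault("credentials", {})
    creds["user"] = user
    creds["password"] = password
    data = updated.setdefault("data", {})
    for key in data:
        if key.endswith("_start_date"):
            data[key] = start_date
    return updated


def rewrite_config_file(user, password, start_date, path=CONFIG_PATH):
    """Load the JSON config at path, update it, and write it back."""
    config = json.loads(Path(path).read_text())
    updated = update_config(config, user, password, start_date)
    Path(path).write_text(json.dumps(updated, indent=4))
```

For example, `rewrite_config_file("me@example.com", "my-password", "07/01/2016")` would point every start date at July 2016, provided the assumed keys match the file.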

 

Run the Script: This process can take a while. I had a lot of old data, so I left it running overnight to download and process.
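The page does not name the script, so the following is a sketch under assumptions: it supposes the GarminDB command-line entry point is `garmindb_cli.py` and that a full download uses the flags `--all --download --import --analyze`, with `--latest` for later incremental updates. Verify both against the GarminDB README for your installed version. The helper names are hypothetical.

```python
import subprocess


def build_garmindb_command(latest=False):
    """Build the assumed GarminDB command line as a list of arguments."""
    cmd = ["garmindb_cli.py", "--all", "--download", "--import", "--analyze"]
    if latest:
        # Assumed flag for fetching only recent data after the first full run.
        cmd.append("--latest")
    return cmd


def run_garmindb(latest=False):
    """Run the command and raise CalledProcessError if it exits non-zero."""
    return subprocess.run(build_garmindb_command(latest), check=True)
```

Calling `run_garmindb()` once for the full history and `run_garmindb(latest=True)` afterwards keeps later runs much shorter than the initial overnight download.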

Let's explore the five databases that were saved. I opened these SQLite databases in DBeaver to explore the data.
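The same exploration can be done from Python instead of a GUI. This is a minimal sketch using the standard-library `sqlite3` module. The directory `~/HealthData/DBs` is an assumption about where GarminDB writes its files, and `list_tables` and `summarize_all` are hypothetical helper names.

```python
import sqlite3
from contextlib import closing
from pathlib import Path

# Assumed default output directory for the SQLite files.
DB_DIR = Path.home() / "HealthData" / "DBs"


def list_tables(db_path):
    """Return {table_name: row_count} for one SQLite database file."""
    with closing(sqlite3.connect(str(db_path))) as conn:
        names = [
            row[0]
            for row in conn.execute(
                "SELECT name FROM sqlite_master WHERE type = 'table' ORDER BY name"
            )
        ]
        return {
            name: conn.execute(f'SELECT COUNT(*) FROM "{name}"').fetchone()[0]
            for name in names
        }


def summarize_all(db_dir=DB_DIR):
    """Return {file_name: {table_name: row_count}} for every .db file in db_dir."""
    return {p.name: list_tables(p) for p in sorted(Path(db_dir).glob("*.db"))}
```

Printing `summarize_all()` gives a quick inventory of which tables exist and how many rows each holds before opening them in a viewer.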

A quick look at the ERD:

Schema
Activity_summary.png
sleep records.png

python-garminconnect

I have sourced most of my data from GarminDB.
However, as the sleep data shows, HRV and the different sleep stages are not available there, and both are critical wellness metrics.

To get this information, I will use the python-garminconnect package.
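A minimal sketch of how this retrieval might look, not necessarily the exact code used here. It assumes the `garminconnect` package is installed and relies on its `Garmin` client with `login()`, `get_hrv_data(cdate)`, and `get_sleep_data(cdate)`. The helper names `make_client` and `fetch_range` are hypothetical.

```python
from datetime import date, timedelta


def make_client(email, password):
    """Log in to Garmin Connect (requires `pip install garminconnect`)."""
    # Imported lazily so fetch_range can be used and tested without the package.
    from garminconnect import Garmin

    client = Garmin(email, password)
    client.login()
    return client


def fetch_range(client, start, end):
    """Collect raw HRV and sleep responses for each day from start to end, inclusive.

    Returns {iso_date: {"hrv": ..., "sleep": ...}} with the unmodified API responses.
    """
    results = {}
    day = start
    while day <= end:
        iso = day.isoformat()
        results[iso] = {
            "hrv": client.get_hrv_data(iso),
            "sleep": client.get_sleep_data(iso),
        }
        day += timedelta(days=1)
    return results
```

For example, `fetch_range(make_client(email, password), date(2024, 1, 1), date(2024, 1, 7))` would gather one week of responses, making one pair of requests per day.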

The hrvSummary column holds the details on HRV (heart rate variability).

Let's process these records and come up with a usable dataframe.
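One way to do this processing is to flatten each `hrvSummary` record into a single flat row. The sketch below assumes each raw response is a dictionary with an `hrvSummary` key whose value may contain nested dictionaries. It makes no assumption about the field names inside it, and `flatten` and `hrv_rows` are hypothetical helper names.

```python
def flatten(record, prefix=""):
    """Flatten nested dicts into one level, joining key names with underscores."""
    row = {}
    for key, value in record.items():
        name = f"{prefix}{key}"
        if isinstance(value, dict):
            row.update(flatten(value, prefix=f"{name}_"))
        else:
            row[name] = value
    return row


def hrv_rows(responses):
    """Turn raw responses into flat rows, skipping days with no hrvSummary."""
    rows = []
    for response in responses:
        summary = (response or {}).get("hrvSummary")
        if summary:
            rows.append(flatten(summary))
    return rows
```

If pandas is installed, `pandas.DataFrame(hrv_rows(responses))` turns the rows into a dataframe with one column per flattened field.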

hrv_records.png

Data Details

We have identified and consolidated four primary datasets for our analysis:

  1. Daily Heart Rate Data (from GarminDB)

  2. Activity Details Data (from GarminDB)

  3. Activity Daily Summary Data (from GarminDB)

  4. Sleep Data & HRV (from Python-GarminConnect)


Our next step is to explore and clean the data, then uncover key relationships between these variables to gain deeper insights and advance our analysis.
