In January 2022, Nationwide® Veterinary Analytics released the first of three planned white papers. Before we talk about the key points in this initial doodle data, I thought it might help to discuss big data -- what it is and what it can / cannot tell us -- so that you can interpret and think critically about any big data you come across. I'm presenting the info as both written content and video. Choose whatever works best for you. Scroll all the way to the bottom to see the video.
New Doodle Data
This new white paper from Nationwide® Veterinary Analytics covers these data points:
- Popularity of doodle-dogs and parent breeds
- Relative risk for cancer claims for doodle-dogs and parent breeds
The data comes from the records of 1.61 million insured dogs over 6 years (Oct 2015 - Sept 2021). You can download and read the announcement, white paper, and details on the methodology and math. I'm big on primary source documents, so I encourage you to read it yourself.
Understanding Big Data
Big data typically features these characteristics:
- Velocity -- data getting generated faster than ever
- Volume -- bigger data sets now accessible and analyzed thanks to powerful computers
- Variety -- data often comes from a variety of sources, including from outside typical scientific inquiry
Big data is different from other studies, though. Scientific studies typically happen via deliberate data collection in a controlled / self-contained environment, with specific objectives for analyzing the data. Big data, on the other hand:
- Potentially comes from non-scientific sources such as the Internet of Things
- Is potentially gathered for reasons *other than* scientific analysis
Not to sound like a conspiracy nut, but sometimes we know the real reasons organizations gather so much data and how they (or others) actually use it. Sometimes we don’t.
As an example, if your bed has an app or you use an activity monitoring device with an app, all that generates data about you. Even if those big data sets reveal interesting things, they may be collected and used in ways other than you think. So, it's good to think critically about where data comes from and its intent, influence, and impact.
That said ... Nationwide® Veterinary Analytics *is* the research arm of the pet insurer, so I am NOT pointing at them or poking their efforts in any way. I think this data is both reputable and interesting.
What Big Data Can and Cannot Tell Us
- Big data may reveal relationships, patterns, and trends that then spur more targeted scientific inquiry.
- Big data can surface correlations, but it cannot establish causality. That requires additional investigation.
- Big data sets also suffer from self-selection or self-exclusion challenges. For example, this doodle data comes from dogs with pet insurance through Nationwide (self-selection). Plenty of dogs -- doodles or otherwise -- do not have insurance through Nationwide (self-exclusion).
The Doodle Data
The white paper says, “Doodle popularity is up, and Doodle parent breed popularity is down.”
The white paper says, “Doodle owners are considerably less likely to have submitted a claim for cancer diagnosis or treatment... Relative risk for cancer claims is dramatically lower in Labradoodles and Goldendoodles in comparison to their contributing breeds — Standard Poodles, Golden Retrievers, and Labrador Retrievers.”
Look for the explanation / math for calculating relative risk in the methodology document. I had to read it several times, and even if I tried to explain it to you, I'd probably fail.
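For a rough sense of the general idea, here is a minimal sketch of the textbook relative risk calculation: the claim rate in one group divided by the claim rate in a reference group. This is NOT the white paper's exact method, which may differ (read the methodology document for that). The function name and all the numbers below are made up purely for illustration.

```python
def relative_risk(group_claims, group_total, reference_claims, reference_total):
    """Textbook relative risk: the group's claim rate divided by the reference group's rate.

    Illustrative helper only; not taken from the white paper's methodology.
    """
    if group_total <= 0 or reference_total <= 0:
        raise ValueError("group sizes must be positive")
    group_rate = group_claims / group_total
    reference_rate = reference_claims / reference_total
    if reference_rate == 0:
        raise ValueError("reference rate is zero, so relative risk is undefined")
    return group_rate / reference_rate


# Hypothetical counts, NOT real data from the white paper:
# 20 of 1,000 dogs in one group have a claim, versus 50 of 1,000 in the reference group.
rr = relative_risk(20, 1000, 50, 1000)
print(round(rr, 2))
```

A value below 1 means the first group files claims at a lower rate than the reference group, a value above 1 means a higher rate, and a value of exactly 1 means the rates match. Keep in mind that a lower claim rate is still only a pattern in the data, not proof of a cause.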
Doodle Data Video
Here's the video I made, if you'd rather hear me talk about all this. Plus, there is bonus chatter at the end about our one and only doodle foster puppy, a sweet gal named Maggie.
Muppet Doodle Puppy
Bonus photo of the doodle foster puppy, Maggie, who stayed with us for a while in spring 2021. She is my only firsthand experience with a doodle dog. Check out our most recent foster puppies (Tato and Gravy)!