Data quality and bias
This page contains analysis and opinion by Dave Verwer.
I believe that the data collected by this survey is generally of very high quality. Here’s how I came to that conclusion:
Number of respondents
2,290 people answered the questionnaire, and the overwhelming majority stayed engaged until the very end, which is remarkable for a survey that had more than a hundred questions!
One indicator of high-quality data is that the number of respondents per question stayed high throughout the questionnaire. Below are the response rates for a selection of questions that applied to every respondent:
| Question | Answered by |
| --- | --- |
| Section 1 – Question 1 | 99.7% |
| Section 2 – Question 15 | 99.4% |
| Section 3 – Question 25 | 99.3% |
| Section 4 – Question 30 | 99.0% |
| Section 6 – Question 43 | 89.7% |
| Section 9 – Question 55 | 93.3% |
| Section 10 – Question 61 | 91.0% |
| Section 11 – Question 69 | 98.7% |
| Section 12 – Question 75 | 97.9% |
| Section 13 – Question 85 | 96.9% |
| Section 14 – Question 97 | 87.1% |
| Section 15 – Question 106 | 97.7% |
Even in the very last section, more than 97% of people were still answering questions.
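As a rough illustration of how per-question response rates like these can be computed, here is a minimal Python sketch. The record layout, the `response_rates` function name, and the sample answers are all hypothetical; this is not the code or the data behind the survey results.

```python
from typing import Optional


def response_rates(
    responses: list[dict[str, Optional[str]]],
    questions: list[str],
) -> dict[str, float]:
    """Return the percentage of respondents who answered each question.

    A question counts as answered when its value is neither None nor
    an empty string (an assumption about how blanks are stored).
    """
    total = len(responses)
    if total == 0:
        return {q: 0.0 for q in questions}

    rates = {}
    for q in questions:
        answered = sum(1 for r in responses if r.get(q) not in (None, ""))
        rates[q] = round(100 * answered / total, 1)
    return rates


# Hypothetical sample: four respondents, three questions.
sample = [
    {"q1": "Swift", "q48": None, "q106": "Yes"},
    {"q1": "Swift", "q48": None, "q106": "No"},
    {"q1": "Objective-C", "q48": "tvOS", "q106": "Yes"},
    {"q1": "Swift", "q48": None, "q106": None},
]

rates = response_rates(sample, ["q1", "q48", "q106"])
print(rates)
```

Dividing by the total number of respondents, rather than by the number who reached each question, keeps the figures comparable from the first section to the last.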
It also speaks to the high quality of the data that only a small number of people answered some of the questions that were not relevant to everyone. For example, only 8.9% of respondents answered Question 48, which makes perfect sense. These response rates suggest that people were not just picking options at random. They were reading the questions and thinking about their answers.
As with all surveys, some level of bias is unavoidable. I believe there are a few things to take into account relating to bias in this survey data:
The gender and ethnicity demographics of survey respondents. Survey respondents were overwhelmingly white and male, mainly due to the demographics of the industry but also potentially because the survey did not reach far enough into the community.
The age and seniority of survey respondents. Survey data is also skewed towards responses from senior developers. The average age group for survey respondents was 30-39, and the average number of years of experience with Apple platforms was 5-10 years.
The time needed to complete the questionnaire. It took most people around 20-30 minutes to answer all 109 questions. As a result, this data is biased towards people who have large amounts of free time.
The promotion of the questionnaire. While the community helped to share the survey questionnaire, most respondents came from my tweets and from links in iOS Dev Weekly. This biases the data towards the people who subscribe to my newsletter.
Is there bias I have missed? Please let me know.
Analysis and opinion
All opinion content on this site is clearly marked with the name of the author, and a link to their profile. All analysis is going to be more biased than the raw data, but it should be clear where the raw data ends, and the analysis begins.
All data collected in the survey was completely anonymous. No personally-identifying information was collected. All results on this site have been produced entirely from that anonymous data.
With any anonymous survey, there’s always a chance of bad data. However, I’ve spent a lot of time looking over these results and haven’t spotted anything suspicious.
I’ve tried to represent the data accurately when building this site. That said, I’m human, and while I have checked everything to the best of my ability, I can’t guarantee that there are no mistakes. If you think anything doesn’t look quite right, please let me know.