I analyzed the Stack Overflow survey and found a stark contrast in the salaries of software engineers, youth, interest in new tools, opinions about AI, ethics and more...
Indian developers have historically worked for a fraction of the salary of their counterparts in developed countries. Much has been written about how developers in those countries feel threatened (or not threatened) by the prospect of losing jobs to cheap outsourcing in countries like India.
When I saw that India has the 2nd largest responder count for the Stack Overflow survey of 2018 and that the survey had a question asking about the responder's salary, I decided to see for myself how cheap Indian developers actually are.
So, I drew some KDE plots to compare the annual salary of Indian developers and the annual salary of developers all over the world:
KDE plots of the annual salary of developers in India (left) and of developers all over the world (right)
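For readers unfamiliar with KDE plots, here is a minimal sketch of what such a curve represents: a Gaussian kernel density estimate evaluated on a grid of salary values. It uses only the standard library. The salary values, the bandwidth, and the function name `gaussian_kde` are made up for illustration and are not taken from the survey or from my Kaggle kernel; in practice a plotting library would compute and draw this for you.

```python
import math

def gaussian_kde(samples, grid, bandwidth):
    """Evaluate a Gaussian kernel density estimate at each point in `grid`."""
    n = len(samples)
    # Normalising constant so that the estimated density integrates to 1.
    norm = 1.0 / (n * bandwidth * math.sqrt(2.0 * math.pi))
    density = []
    for x in grid:
        total = sum(math.exp(-0.5 * ((x - s) / bandwidth) ** 2) for s in samples)
        density.append(norm * total)
    return density

# Hypothetical annual salaries in USD (illustrative only, not survey data).
salaries = [4_000, 6_500, 7_000, 9_000, 12_000, 15_000, 30_000, 60_000]
grid = list(range(0, 100_001, 1_000))
density = gaussian_kde(salaries, grid, bandwidth=5_000)

# The grid point with the highest density is where the plot would spike.
peak = grid[max(range(len(grid)), key=density.__getitem__)]
print(f"density peaks near ${peak:,}")
```

A cluster of low salaries pulls the peak of the curve toward the left edge of the plot, which is the kind of spike discussed in this section.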
And then I drew similar plots for the other 3 top responding countries of the survey, i.e., the United States, Germany, and the United Kingdom:
See that sharp spike near $0 for KDE plot of Indian developers?
That is when I decided to completely dig into this survey. I wanted to explore what made Indian developers so different.
The top 5 countries which drew the maximum number of responses to this survey (in decreasing order) are:
So, I drew up the overall stats for all the responders for each question asked in the survey and then I compared it with the corresponding stats of responders from India only. To make the comparison even more conspicuous, I have also shown the stats for the other 3 top responding countries - the US, UK, and Germany.
Here's what I found:
A large majority of the developers from India work at salaries several times lower than those of their counterparts in developed countries.
Only about 63% of Indian responders feel obliged to consider ethics when writing code.
India has a larger proportion of developers with a college major in a CS-related field as compared to the entire world or the other top 3 responding countries.
India has a much younger developer population as compared to the other groups. They are competitive and they are ambitious.
Mobile development has a unique appeal in India.
Indian developers are more interested than the other groups in all of the new hypothetical tools.
Opinions of Indian developers about AI differ a lot from the rest of the world: we are more worried about the automation of jobs, more excited about AI making important decisions, more worried about a singularity-type situation, and not so worried about evolving definitions of "fairness" in algorithmic versus human decisions.
Many Indians contribute to open-source software but don't think that they are learning from doing so.
Read on to see the visualizations and understand more about these results, or check out my complete report on Kaggle (and do upvote it if you like it).
I have only recently started out with Data Science and I'm not sure if the approach that I have taken in my analysis is the best one possible (or even correct). So, I would really appreciate some feedback/criticisms about what I have done right/wrong.
More on salary:
Salary stats - count, mean and the 25th, 50th, 75th, and 90th percentiles - for various groups
The average annual salary of developers in India is almost 2.5 times lower than the average salary of developers all over the world, almost 5 times lower than the average in the US, almost 4 times lower than the average in the UK, and almost 3 times lower than the average in Germany.
More than 50% of the Indian developers earn less than $10,000 a year.
The median annual salary for the entire developer population in the world is ~3.5 times the Indian median. It is ~10 times as high in the US and ~6 times as high in both the UK and Germany.
[Note that I have only included the responders who have jobs in the above plots. I have not counted the ones who reported a salary of 0.]
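The summary statistics above (count, mean, and the 25th, 50th, 75th, and 90th percentiles, with zero salaries left out) can be sketched with the standard library as follows. The group names, the salary figures, and the helper name `salary_summary` are hypothetical and only illustrate the calculation; the "inclusive" quantile method is my assumption and may differ slightly from what a data-frame library reports.

```python
import statistics

def salary_summary(salaries):
    """Count, mean and selected percentiles, ignoring missing and zero salaries."""
    paid = [s for s in salaries if s is not None and s > 0]
    # quantiles(n=100) returns the 99 cut points for percentiles 1..99.
    cuts = statistics.quantiles(paid, n=100, method="inclusive")
    summary = {"count": len(paid), "mean": statistics.fmean(paid)}
    for p in (25, 50, 75, 90):
        summary[f"p{p}"] = cuts[p - 1]
    return summary

# Hypothetical salaries in USD per group (illustrative only, not survey data).
groups = {
    "Group A": [0, None, 5_000, 8_000, 10_000, 20_000, 40_000],
    "Group B": [0, 45_000, 60_000, 75_000, 90_000, 120_000],
}
for name, salaries in groups.items():
    print(name, salary_summary(salaries))
```

Comparing the `mean` and `p50` entries across groups gives ratios like the ones quoted above.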
So, are the salaries of Indian developers so low because they are less educated in CS-related fields than their counterparts in other countries?
India actually has a higher percentage of responders who have a Bachelor's degree than the other groups.
It also has a higher percentage of responders with a Master's degree as compared to the other groups except for Germany.
India has a much smaller percentage of responders who have a doctoral degree (such as a Ph.D.).
Let's see what they majored in...
There is a higher percentage of CS majors in India as compared to the other countries taken into consideration and the world taken as a whole.
Well, it seems safe to say that it is not the lack of a formal CS degree that's keeping Indian developers from high-paying jobs.
Then what is?
Maybe a larger proportion of Indian developers are not coding out of interest but merely to earn a salary. This could result in them producing sub-par code which isn't of much value. Let's see if this is the case...
Indian developers are just as interested in coding as any other developer group.
But even then, it is a reality that a large proportion of cheap outsourcing work is done by developers from India. So, why is it that such a large proportion of Indian developers are kept away from the top-paying jobs?
I believe that a combination of living conditions, quality of education, and the country's economic structure handicaps an Indian programmer's ability to even compete with the ones from more developed countries like the US, UK, and Germany.
How else could you explain the poor salaries despite that excellent CS-major percentage?
Age and experience
93% of the Indian responders are aged 18 to 34, whereas this number is 73% for the world taken as a whole and even lower for the other 3 top responding countries.
If we calculate the percentage of developers who have been coding in a professional capacity for 0 to 5 years, it is more than 78% in India, whereas it is 48% in the US, 46% in the UK, 55% in Germany, and 57% for the entire world taken together.
Clearly, the developer population is extremely young in India.
I believe that a major reason behind this stark variation in developer age and experience between India and the other groups considered here is that India, being a developing country, was introduced to computing technology much later than the others. This resulted in India producing very few programmers in the 20th century or even in the early 2000s.
This probably results in Indians mainly contributing to the younger workforce of companies and occupying mostly entry-level jobs, which probably pay less than more experienced roles. This is also probably a factor in the low average salaries of developers from India.
It also means that the job market for the junior developer roles is incredibly crowded here in India which might explain why the Indian developers are particularly competitive in nature...
More than 50% of Indian developers agree that they are competing with their peers, and another 25% are neutral on the topic.
The survey had a few questions relating to ethics in coding.
Responsibility for unethical code:
While almost 60% of the developers in all the other groups would hold upper management responsible for unethical code, only 40% of the developers in India do so.
A much larger proportion of Indian developers as compared to the developers in the other groups would hold the person who came up with the idea of unethical code or the developer who wrote it as the one responsible.
Write unethical code?
A majority of the Indian developers are ready to write code that they themselves consider unethical.
Consider the ethical implications of your code?
Only 63% of the Indian responders believe that they have an obligation to consider the ethical implications of the code that they write.
Perhaps it's a bit difficult to consider ethics when you have a low-paying job.
Imagine being in a country with a high unemployment rate, low salaries, and a highly competitive job market. And then, imagine being asked to write unethical code at the job that you have, or risk making your boss unhappy and losing the next big promotion to a co-worker who agrees to do it, or maybe even risk losing the job itself!
So I believe that if a developer agrees to write unethical code, it does not necessarily mean that he/she is an unethical person. Sometimes it's the same dilemma as the choice between stealing bread or letting your family go hungry. The correct choice isn't always black and white.
The popularity of mobile app development:
We can see that the orange bar representing India sticks out in the "Mobile developer" category.
In fact, India has the largest mobile developer count in the entire world:
A large majority of the Indian population might have entirely skipped the PC revolution and leapfrogged to smartphones as their first personal computing devices. Hence there is so much excitement around smartphones and mobile apps. I think that is a major reason behind the interest that we see in mobile app development in India.
We can also see the effect of the popularity of mobile development on the choice of development platform amongst the Indian developers...
Android and Firebase are among the top 3 most popular platforms in India. These 2 platforms have outranked Windows Desktop or Server in terms of popularity amongst the Indian developers.
And, the interest in them is still growing because even larger percentages of developers want to work on them next year!
Note that Firebase is nowhere near the top 3 in any of the other groups seen above; India is the only exception.
Opinions about AI
The survey had a few questions relating to the advancing AI technology. Two of them were:
What do you think is the most dangerous aspect of increasingly advanced AI technology?
What do you think is the most exciting aspect of increasingly advanced AI technology?
Both of them had the same set of options viz.
Algorithms making important decisions
Artificial intelligence surpassing human intelligence ("the singularity")
Evolving definitions of "fairness" in algorithmic versus human decisions
Increasing automation of jobs
Therefore, I've plotted a separate graph for each of the above options to see whether a larger share of each responder group sees it as a danger or as a reason for excitement:
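The comparison behind these graphs can be sketched as follows: for each option, compute the percentage of responders who picked it as the most dangerous aspect and the percentage who picked it as the most exciting one. This is a minimal sketch using the standard library; the record keys `dangerous` and `exciting`, the helper name, and the sample responses are hypothetical and are not the survey's actual column names or data.

```python
from collections import Counter

OPTIONS = [
    "Algorithms making important decisions",
    'Artificial intelligence surpassing human intelligence ("the singularity")',
    'Evolving definitions of "fairness" in algorithmic versus human decisions',
    "Increasing automation of jobs",
]

def danger_vs_excitement(responses):
    """Return {option: (% calling it most dangerous, % calling it most exciting)}."""
    dangerous = Counter(r["dangerous"] for r in responses if r.get("dangerous"))
    exciting = Counter(r["exciting"] for r in responses if r.get("exciting"))
    # Guard against division by zero when nobody answered a question.
    n_dangerous = sum(dangerous.values()) or 1
    n_exciting = sum(exciting.values()) or 1
    return {
        option: (
            100 * dangerous[option] / n_dangerous,
            100 * exciting[option] / n_exciting,
        )
        for option in OPTIONS
    }

# Hypothetical answers from one responder group (illustrative only).
responses = [
    {"dangerous": OPTIONS[3], "exciting": OPTIONS[0]},
    {"dangerous": OPTIONS[3], "exciting": OPTIONS[1]},
    {"dangerous": OPTIONS[1], "exciting": OPTIONS[0]},
    {"dangerous": OPTIONS[0], "exciting": OPTIONS[3]},
]
for option, (danger, excitement) in danger_vs_excitement(responses).items():
    print(f"{danger:5.1f}% dangerous | {excitement:5.1f}% exciting | {option}")
```

Running the same function once per responder group (India, the world, and each of the other countries) gives the pairs of bars being compared.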
Potential automation of jobs by AI is not so exciting to us Indians. We are more likely to put it in the "Dangerous" bucket. This is unlike the world taken as a whole and the other 3 top responding countries, where a larger proportion of the population is excited about it.
Indians are much more comfortable with a future in which AI is making important decisions for them than any of the other 3 countries taken into consideration or the world taken as a whole.
Indians are more worried about a "singularity"-type situation than the other groups are.
Indians are not as worried about the evolving definitions of "fairness" in algorithmic vs. human decisions as the other groups are.
Interest in hypothetical tools
Stack Overflow wanted to know which new hypothetical tools the developer community is interested in. It asked the responders to rate four hypothetical tools from 1 to 5, where 1 denoted "Extremely interested" and 5 denoted "Not at all interested".
I found that the Indian developers are much more interested in any new hypothetical tools that the Stack Overflow team proposed as compared to the developers around the world or the developers from any of the other top 3 responding countries.
I thought that maybe this interest in all new hypothetical tools is a result of the young age of the Indian developer community that we saw earlier. Therefore, I grouped all the developers according to their years of coding experience and calculated the percentage of developers at each interest level for every experience group.
Here are some heat maps representing those percentages:
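The percentages behind heat maps like these come from a row-normalised cross-tabulation: count the ratings within each experience group, then divide by the size of that group. Here is a minimal sketch with the standard library; the experience labels, the sample ratings, and the function name `interest_by_experience` are hypothetical and not taken from the survey data.

```python
from collections import Counter, defaultdict

def interest_by_experience(rows):
    """Row-normalised crosstab: {experience group: {rating: % of that group}}."""
    counts = defaultdict(Counter)
    for experience, rating in rows:
        counts[experience][rating] += 1
    table = {}
    for experience, ratings in counts.items():
        total = sum(ratings.values())
        # Ratings run from 1 ("Extremely interested") to 5 ("Not at all interested").
        table[experience] = {r: 100 * ratings[r] / total for r in range(1, 6)}
    return table

# Hypothetical (experience group, rating) pairs for one tool (illustrative only).
rows = [
    ("0-2 years", 1), ("0-2 years", 1), ("0-2 years", 2), ("0-2 years", 5),
    ("30 or more years", 4), ("30 or more years", 5),
]
for experience, percentages in interest_by_experience(rows).items():
    print(experience, percentages)
```

Each inner dictionary corresponds to one row of a heat map, so every row sums to 100%.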
Clearly, there is a strong relationship between the number of years one has been coding and one's interest in new hypothetical tools. Less experienced developers tend to be much more interested in new hypothetical tools than more experienced developers.
This contrast is most striking for the tool "A private area for people new to programming" (1st row, 2nd figure). It's clear that young developers who are new to programming want a private area for newbies.
Maybe it's because the existing Stack Overflow can be quite harsh and unwelcoming for newcomers. However, the older developers are very much uninterested in this tool.
Maybe it's the curse of knowledge. Or maybe the more experienced devs believe that the existing tools are fine and sufficient because, well, they have made a career out of the existing ones. Or maybe, somehow, their experience in the field has made them pessimistic about the success of such tools.
Whatever the reason, it is clear that if the Stack Overflow team does decide to go forward with any of the above-mentioned products, younger people are more likely to welcome it, and India might be a particularly good country in which to try the launch.
The peculiar thing about open-source contributions:
India has the largest percentage of open-source contributors among all the groups considered in this analysis.
Now, here's the interesting part...
There was another question in the survey asking -
"Which of the following types of non-degree education have you used or participated in? Please select all that apply."
One of the options for this question was - "Contributed to open source project".
I thought that contributing to open-source software doesn't just help the community but it also helps the contributor in his/her personal growth. So, the people who do contribute would also mark it under non-degree education.
But it turns out that that's not the case...
While 50% of responders from India contribute to open-source software, only ~30% of responders from India say that they are learning something from doing so. This is in contrast with the entire world and the other countries taken into consideration, where the two percentages are almost the same.
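The two percentages being compared can be sketched like this. I am assuming here that the select-all-that-apply answers are stored as a single semicolon-separated string; the record keys `open_source` and `education_types`, the other option names, and the sample responders are hypothetical and only illustrate the calculation.

```python
OPTION = "Contributed to open source project"

def open_source_gap(responders):
    """Return (% who contribute to open source, % who list it as education)."""
    n = len(responders)
    contributes = sum(1 for r in responders if r["open_source"] == "Yes")
    learns = sum(
        1
        for r in responders
        # A missing answer is treated as "no options selected".
        if OPTION in (r["education_types"] or "").split(";")
    )
    return 100 * contributes / n, 100 * learns / n

# Hypothetical responders from one group (illustrative only, not survey data).
responders = [
    {"open_source": "Yes",
     "education_types": "Taken an online course;Contributed to open source project"},
    {"open_source": "Yes", "education_types": "Taken an online course"},
    {"open_source": "No", "education_types": None},
    {"open_source": "No", "education_types": "Participated in a hackathon"},
]
contributes_pct, learns_pct = open_source_gap(responders)
print(f"contribute: {contributes_pct:.0f}%, list it as education: {learns_pct:.0f}%")
```

A large difference between the two numbers for a group is the gap discussed above.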
I am not sure what to make of this. Maybe it means that a lot of Indians who do contribute to open-source projects aren't regular at it. Let me know what you think.
In this analysis, we saw the various areas in which software developers in India differ from software developers in other countries.
India had a late start in the world of computing, as is evident from the low percentages of older developers that we saw above, but software development has become incredibly popular with the current generation. For much of India's recent history, working in IT and software development has been the surest ticket out of poverty.
We also saw that the developers here realized the potential of the last major revolution in tech - mobile apps. This makes me hopeful that we are catching up fast. Maybe, we will even play a pivotal role in defining the next major tech-revolution (in Artificial Intelligence, maybe).
You should check out my complete analysis on Kaggle for more insights like how Indian developers differ in the methods of self-education adopted, IDE used, programming methodology used, the popularity of various languages and frameworks, use of ad blockers, operating system, Stack Overflow usage and more. Also, please do upvote it or comment if you find it interesting.
Other ideas that will be interesting to explore
Jason Goodman advises that when building data science projects, pick something you're curious about. If it's something that interests you, it'll be more fun and you're more likely to find interesting angles with it.
I tried to take this advice to heart by analyzing the Stack Overflow survey data to find out how the developers from my country differ from the others.
Doing this lengthy analysis has made me curious about a few other questions too. So, here I post all the questions that I wish to ask this dataset -
Do people learn by contributing to open-source projects?
I assumed that people do. But it turns out that not all of them do (like the ~20% of Indian responders that we saw in this analysis). So, it would be interesting to study this more.
How do opinions about AI differ?
In this kernel, we saw how the Indians have many different opinions about the various aspects of advancing AI tech. It would be interesting to draw comparisons in other groups based on age, developer jobs, education, frameworks, languages, salaries, etc.
What do the students think?
1 out of every 5 responders to this survey is a full-time student which means that we have the survey data for 20,000 students. I think that this provides a unique opportunity to explore what's popular among the next generation of software developers (myself included :D).
How do Vim users differ from non-Vim users?
This is just because I use Vim. I love Vim. It would be fun to make something like this excellent R vs. Python analysis.
A salary predictor
Stack Overflow recently launched a salary calculator using this very dataset. It might be fun to try and build a similar thing.
The exciting part is that it is possible to arrive at satisfactory solutions to all of them by analyzing this very dataset!
I plan on exploring some of these topics myself. But feel free to go ahead and do your own analysis if you find any of the above questions interesting. Also, if you found this analysis interesting, you can do a similar one for your own country.
It would be awesome if you could tag me in the comments or shoot me a tweet on Twitter or a message on LinkedIn when (or if) you decide to make your work public.
Do check out my complete analysis - "How Indian developers differ" on Kaggle. If you found my analysis interesting, I would really appreciate it if you could upvote that kernel.