6 Quantitative data analysis, data collection and sampling methods
It is important to distinguish between the various purposes for which quantitative, and indeed any kind of, analysis can be carried out. This concerns the sort of conclusion that is being aimed at; or, to put it the other way around, the kind of question that is being addressed.
Read the following quantitative research study on the history of shareholding and investment in England and Wales, 1870−1930, by Janette Rutterford et al.:
- Rutterford, J., Maltby, J., Green, D. R. and Owens, A. (2009) ‘ ’, Accounting History, vol. 14, no. 3, pp. 269–292.
The project was funded by an Economic and Social Research Council grant. The study collects primary data from historical archives in order to investigate investor behaviour in the late nineteenth century and the beginning of the twentieth century.
When you have become familiar with the study, answer the following questions:
What was the motivation of this study/project?
Add your notes here.
Feedback by Professor Janette Rutterford:
Yes. The motivation was the investigation of the relationship between gender and investment behaviour during a particular period – 1870 to 1930 – when major changes in investment took place. Investment as a whole boomed at that time with the Limited Liability Acts of the 1850s, the marketing of shares to an increasingly wealthy middle class, the introduction of new types of security with lower risk to attract risk-averse individuals, and the increased awareness of investment opportunities that came with the marketing of bonds and savings certificates to fund World War One. And from the female side, there were a number of changes too. There were the Married Women’s Property Acts of 1870 and 1882, which gave married women rights to hold investments in their own name. And there was also the increasing number of what were called ‘surplus’ or ‘redundant’ women, who needed an investment income to live off, as work was either frowned on or not available; this was then exacerbated after World War One by the deaths of men in the war. We wanted to identify changes in investment behaviour by looking at two samples of investment records. First, we had a full sample of extant probate records – those that were left – covering assets at death between 1870 and 1902. The remainder had been destroyed. But on testing, we found that the sample that we had was representative of the population in terms both of wealth and of geography across the UK. The second set of investment records, which we’re going to talk about today, was shareholdings in individual company securities over the period 1870 to 1935.
Given the size of the population of UK investors during the time of investigation, the data collection should focus on a representative sample. How would you choose the companies in this sample?
Add your notes here.
Feedback by Professor Janette Rutterford:
Well, we looked at a number of possible differences and we tried to pick companies that covered the whole spectrum. So, for example, for sectors we looked at all possible types of sector and broke them down into: agricultural; commercial and breweries (breweries were big in the late nineteenth century); extractive, which means mining; manufacturing, such as steel mills; transport and communications, of which railways were the major part; and utilities, for example gas companies, which in those days were privately owned.
We also looked at the span of time. We wanted companies which had at least two dates of shareholder records ten years apart so we could get some kind of continuity.
We looked at longevity, that is, whether the companies succeeded or not, so that we avoided survivorship bias. So some companies which died, went bankrupt or were taken over were included, as were some such as Foreign and Colonial Trust, which started in 1868, lasted throughout the period and is in fact still going.
We looked at different sizes, some small, some large. Some were small and became large. Some were large and became small.
We looked at geography, and what we mean by that is where the companies operated. They were all registered in England or Wales, but some operated domestically – for example, companies such as Boots the Chemist. We had Empire operations such as Mahratta, an Indian railway company. And we had companies which operated in foreign countries, such as the Cuba Submarine Telegraph. Investors in those days weren’t frightened to invest globally, even though the companies were all registered in the UK.
Another difference we looked for was different security types. I’ve mentioned the fact that there were ordinary shares but there were also preference shares and debentures which were lower risk and very attractive to individual investors. So we looked at companies which had different variants of the securities.
And finally we looked at both public companies and private companies. What I mean by that is companies where the securities were quoted on the stock exchange, where you could buy and sell them with relative ease. And we also looked at some private companies, where really the shares could only change hands amongst the shareholders. It was very difficult to sell in the outside world.
Could you explain the random sampling of shareholders followed by this research?
Add your notes here.
Feedback by Professor Janette Rutterford:
Our research differed because we were looking not cross-sectionally but longitudinally over time. We found that shareholder numbers went from a few dozen to tens of thousands over time, so from very small numbers to very large numbers. We were also collecting a lot of information as well as just the name of the shareholder: address, gender, marital status, details of holdings, company details et cetera, et cetera. So we had a huge amount of data to collect and we clearly couldn’t do one hundred per cent sampling. Systematic sampling, such as sampling one in ten, which I said that Kimmell and Parkinson did, wouldn’t have worked either, given the changes in numbers of shareholders per company, as sampling one in ten in the early years would have told us almost nothing.
The other approach, which we could adopt, was random sampling. You could do this by, say, numbering shareholders from one to whatever number N and then picking random numbers to sample the shareholder records. But there were practical reasons against this.
We would have had to count shareholder numbers per register, which would have been time consuming for large registers. Barclays, say, at one point had thirty thousand shareholders, and then finding each individual sampled shareholder among those thirty thousand would also have taken time. So we chose a compromise between random and systematic sampling.
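The contrast between the two textbook approaches mentioned in the feedback can be illustrated with a short Python sketch. It is not taken from the study: the register sizes (36 and 30,000 shareholders), the function names and the sample size of 200 are all hypothetical, chosen only to show why one-in-ten sampling yields very little for small early registers, and why simple random sampling needs the full count N in advance.

```python
import random

def systematic_sample(records, step=10, start=0):
    """Take every `step`-th record, beginning at index `start` (one-in-ten by default)."""
    return records[start::step]

def simple_random_sample(records, n, seed=None):
    """Draw n records without replacement; needs the whole numbered list up front."""
    rng = random.Random(seed)
    return rng.sample(records, min(n, len(records)))

# Hypothetical registers: a small early one and a large later one.
small_register = [f"holder_{i}" for i in range(1, 37)]       # 36 shareholders
large_register = [f"holder_{i}" for i in range(1, 30001)]    # 30,000 shareholders

small_systematic = systematic_sample(small_register)
large_systematic = systematic_sample(large_register)
large_random = simple_random_sample(large_register, 200, seed=1)

print(len(small_systematic), len(large_systematic), len(large_random))
```

With these made-up sizes, one-in-ten sampling returns only four shareholders from the small register but three thousand from the large one, which is the imbalance the feedback describes.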
The first thing we did was that we estimated the total number of shareholders by calculating the average number of shareholders per page using five pages. We then counted the number of pages and multiplied that by the average per page to get an estimate of the total number of shareholders. We then identified how many shareholders we needed to sample per register to give us a ninety five per cent confidence level, and a seven per cent confidence interval with respect to the population as a whole. We chose these boundaries as they were satisfactory and tighter requirements led to much higher numbers of shareholders needing to be sampled. In practice, this level of confidence meant we had a maximum of two hundred shareholders per register.
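The arithmetic behind this step can be sketched in Python. The feedback does not state which formula was used, so the sketch assumes the standard sample-size formula for a proportion (Cochran's formula with p = 0.5 and a finite population correction) and reads the ‘seven per cent confidence interval’ as a margin of error of plus or minus seven percentage points. The page counts and function names are hypothetical.

```python
import math

def estimate_population(counts_on_sampled_pages, total_pages):
    """Estimate register size: mean shareholders per sampled page x number of pages."""
    mean_per_page = sum(counts_on_sampled_pages) / len(counts_on_sampled_pages)
    return round(mean_per_page * total_pages)

def required_sample_size(population, margin=0.07, z=1.96, p=0.5):
    """Assumed formula: Cochran's sample size with finite population correction.

    z = 1.96 corresponds to a 95 per cent confidence level; p = 0.5 is the
    most conservative choice for an unknown proportion.
    """
    n0 = (z ** 2) * p * (1 - p) / margin ** 2       # size for a very large population
    n = n0 / (1 + (n0 - 1) / population)            # shrink for a finite register
    # Round away floating-point noise before taking the ceiling.
    return min(population, math.ceil(round(n, 6)))

# Hypothetical register: five sampled pages, 1,000 pages in total.
estimated_n = estimate_population([28, 31, 30, 29, 32], total_pages=1000)
sample_needed = required_sample_size(estimated_n)
print(estimated_n, sample_needed)
```

Under these assumptions the required sample never exceeds 196 however large the register, which is consistent with the maximum of about two hundred shareholders per register mentioned in the feedback.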
We then generated a random number for each letter of the alphabet of the surnames of the shareholders. A minimum of three letters per register were sampled, to avoid just looking at family clusters. So for example if we’d just taken a B and all the owners, the big directors and things had a name beginning with B it would have biased our sample.
With each random letter of the alphabet chosen, a random start page was determined. And this was quite important because we had shareholder records which came from different sources: some were those formally submitted to the Registrar of Companies, which is a formal document, and some were the actual shareholder registers that the companies were managing. And so what we had was some shareholder records that were ‘sticky’. For example, some lists just added new shareholders at the end of the letter, so sampling only from the beginning of the letter would have selected only the original shareholders and we wouldn’t have got the new shareholders that came in.
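One way to picture the compromise described above (random letters, then a random start page within each letter) is the following Python sketch. It is an illustration only, not the project's actual procedure: the data layout (a dictionary from surname initial to a list of pages), the equal per-letter quota, reading forward from the start page to the end of the letter, and all names are assumptions.

```python
import random
import string

def sample_register(register, target, min_letters=3, seed=None):
    """Sample about `target` names using random letters and random start pages.

    register: dict mapping a surname initial to a list of pages,
              where each page is a list of shareholder names (assumed layout).
    """
    rng = random.Random(seed)
    letters = sorted(letter for letter, pages in register.items() if pages)
    rng.shuffle(letters)                          # visit letters in random order
    per_letter = max(1, target // min_letters)    # cap so several letters are needed
    sample, letters_used = [], []
    for letter in letters:
        remaining = target - len(sample)
        if remaining <= 0:
            break
        pages = register[letter]
        start = rng.randrange(len(pages))         # random start page within the letter
        names = [name for page in pages[start:] for name in page]
        picked = names[:min(per_letter, remaining)]
        if picked:
            sample.extend(picked)
            letters_used.append(letter)
    return sample, letters_used

# Hypothetical register: 26 initials, 20 pages each, 30 names per page.
demo_register = {
    letter: [[f"{letter}-p{page}-n{i}" for i in range(30)] for page in range(20)]
    for letter in string.ascii_uppercase
}
sample, letters_used = sample_register(demo_register, target=196, seed=42)
print(len(sample), letters_used)
```

Capping the number taken from any one letter forces the sample to spread across at least three letters, which reflects the concern about family clusters, while the random start page means later additions at the end of a letter have a chance of being picked up.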
And we would have liked to track the same shareholders over time because we sampled registers every ten years from the 1870s to the 1930s and some company registers were sampled over a period of forty years. But many share records were not in strictly alphabetical order, and numbers rose dramatically in many cases, so we just couldn’t do what is called cluster sampling.
Janette Rutterford, Professor of Financial Management at the Open University Business School and principal author of this study, summarises the main findings as follows:
We learnt a lot about investors in general, not just women. For example, how where they lived affected what they bought. So London-based shareholders were very keen on Empire securities such as railways from India, whereas northerners bought local companies, for example steel mills in Sheffield or shipping in Liverpool.
We obviously learnt a lot about women investors because that was the aim of the project, and we found that their numbers changed dramatically over the period. Women went from being fifteen per cent of all individual investors in the 1870s to forty-five per cent – nearly half – by the 1930s. In terms of value of investments, they started in the 1870s with five per cent by value and by the 1930s they were up to thirty-five per cent by value. So although perhaps how much they held wasn’t quite as great as for men, the change was really dramatic. And in fact we found that in some shareholder registers women were in the majority, particularly for lower risk securities such as preference shares. So we’re back to these women who were looking for low risk investments for an unearned income. And also they liked brand names such as J. Lyons, which had teashops, which they used, or Boots the Chemist, which actually sold shares over the counter at one point.
We are still working on the data in terms of the geography of investment, by which I mean how it differed over time, and also we’re looking at how investors diversified their portfolios. Was the increase in the number of shareholdings, which we identified in this research project, due to a larger and larger investor population? Or was it the same small group of investors who simply added more and more shares to their portfolios over time?
Listen to the audio, add your own notes and then compare them with the feedback provided.
The paper and the interview begin with a general description of the research project and its main motivation. They also refer to previous research on the same issue. This is very important because it differentiates the current study from previous research, highlighting its strong points and its theoretical contribution. Particular emphasis is also placed on the sampling method, as the main aim of the paper is to identify characteristics of the shareholder population between the 1870s and the 1930s by investigating a ‘representative population of industries’ (Rutterford et al., 2009, p. 7). The interview and the paper (in more detail) offer an analytical description of the choice of sectors and related companies in order to achieve a representative sample. They also address issues involved in the sampling process when working with the existing archives. In research projects of this type, the quality of the results hinges upon the structure of the sample, which makes the sampling design one of the most important parts of quantitative research. A general overview of the process can be summarised as follows:
We have here described the methods we used to compile a database of a representative sample of shareholders, covering UK companies from all economic sectors. We have outlined some of the issues and challenges that face historians in identifying appropriate source material and in devising suitable sampling strategies. […] Providing an extensive, sectorally representative sample, and incorporating companies of varying sizes, ages, capital structures and profitability rates, operating in geographical locations within and beyond England and Wales, the data offer a unique opportunity to investigate a neglected aspect of the country’s financial and social history. As well as casting light on the growing significance of investment wealth in the population as a whole, the data set allows us to investigate with confidence the significance of gender in understanding patterns of shareholding and, in turn, how this was shaped by wider, legal, financial and cultural change. The specification of this methodology along with the construction of a major database for capturing historical data on investment […] will also provide a platform for the research to be extended, chronologically and geographically in the future, and a data set that historians of investment in other countries can use as a test against their own findings.