Monday, March 23, 2015

Link IPEI

https://www.facebook.com/tompkinstrustcompany?sk=app_403834839671843&brandloc=DISABLE&app_data=chk-550c9fe484598&pnref=story

in-class experiment link

https://www.surveymonkey.com/r/YFJM5FS

T-test overview

Often in social science situations, we want to see if there is a statistical difference between two groups.

That is, are any recorded differences due to "chance" (not significant), or are they NOT due to chance (significant)?

To determine if the differences are significant, we use a simple inferential statistical test called the t-test.

Here's how we solve for t:

t = (X1 - X2) / Sm

where X1 = mean (average) of the first group, X2 = mean (average) of the second group, and Sm = standard error of the difference between the means.

So, for example, let's say we have two groups of students in an experiment where we're trying to test whether or not kids can learn their multiplication facts better via a TV show than in school.

Let's say we bring in 20 kids and we randomly assign them to two groups. The first group of 10 learns their multiplication facts via TV show, and the second group of 10 learns via the traditional classroom approach.

So we have something like this:

# of TV kids = 10
# of class kids = 10
Overall N = 20

We examine their scores and we see that the TV kids averaged a 2/10 on a post-test quiz measuring multiplication facts and the traditional class kids averaged a 6/10 on the same test.

Normally, you'd have to solve for the Sm, but-- since this is an 11-day online course-- I'm just going to provide it to you to simplify the process (plus, it's simple to solve and most statistical packages provide it as a matter of course, so almost never do you solve for it by hand). Let's say that the Sm = 1.05.

Ok, so here's what we do--

We solve for t like this:

t = (2 - 6) / 1.05

t = -4 / 1.05

t = -3.81, so t = 3.81 (we always take the absolute value)

This value, in and of itself, tells us nothing.

Just like chi-square, however, we have to be concerned about degrees of freedom (df).

The df for a t-test is simple-- you take the N for the group and subtract one. So, for the first group, the df is 9 (10-1), and for the second group it's also 9 (10-1).

9 + 9 = 18, so the df = 18.

So, now armed with this info, we can check out the t-value chart (either the one in the book, or one we find easily online-- like here for example-- t-test table), and we see that in order to be significant with 18 df (at the .05 level), the t-value needs to be greater than 2.10.

Since our t-value of 3.81 is higher than 2.10, we say that there is a significant difference between the groups.
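The whole calculation above can be sketched in a few lines of Python. The Sm of 1.05 and the critical value of 2.10 come straight from the example; the variable names are mine, and in a real analysis the standard error would be computed from the raw scores rather than supplied:

```python
# Worked t-test example: TV group vs. traditional classroom group.
mean_tv = 2.0       # TV kids' average quiz score
mean_class = 6.0    # classroom kids' average quiz score
sm = 1.05           # standard error of the difference (given in the text)

t = abs((mean_tv - mean_class) / sm)  # we always take the absolute value
df = (10 - 1) + (10 - 1)              # (n1 - 1) + (n2 - 1) = 18

critical = 2.10     # t-table value for df = 18 at the .05 level
print(round(t, 2))  # 3.81
print(t > critical) # True -> significant difference between the groups
```

Since 3.81 exceeds the table value, the code reaches the same conclusion as the hand calculation.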

We then look back at our original data and we see that the traditional kids scored, on average, much better than the TV kids, so we conclude that it's better to use the traditional method.

(Typically, in experimental settings, we never make such a claim without several iterations of the experiment).

Get it?

Chi-square overview

When we talk about inferential statistics, we're simply determining whether or not the results we obtained were due to chance.

(Inferential means that, if our sample is representative, we can INFER from our sample that the results are indicative of the entire population).

If they are not due to chance, we suggest a relationship between variables. If they are due to chance, we cannot make such a claim.

Think of inferential stats as a light switch-- it's either on or off. In the social sciences, if the significance is .05 or lower (that is, we allow for 95% confidence), then we say the switch is "on" and the results are "significant"-- meaning that we are 95% sure that the results are NOT due to chance.

If p (the probability that the results would show up like this by chance) is HIGHER than .05, then we say there is NO significance-- which means we can't argue that the variables are related.
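The "light switch" rule can be written out explicitly. This is just a sketch of the decision logic described above; the function name is my own:

```python
# The "light switch" decision rule: significance is all-or-nothing.
# .05 is the conventional significance level in the social sciences.
ALPHA = 0.05

def is_significant(p):
    """Significant ("switch on") when p is at or below the alpha level."""
    return p <= ALPHA

print(is_significant(0.03))  # True  -> significant; unlikely due to chance
print(is_significant(0.20))  # False -> no claim of a relationship
```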

Well, we already know that if we have a nominal IV and a nominal DV, then we test using chi-square.

Chi-square is a simple statistical test.

Put simply, chi-square is the sum, across all cells, of the observed frequency minus the expected frequency, squared, divided by the expected frequency.

The observed frequency is simply the number reported. The expected frequency is what you'd expect if it were completely by chance.

It's best explained with an example.

Suppose we had the following data:

We asked 97 people about their political affiliations (let's assume it's a random sample) and we got this:

Gender          Republican    Democrat    Row Total

Male                23            17          40

Female              20            37          57

Column Total        43            54          97


Our hypothesis is:

H1: Women are more likely to be affiliated with the Democratic party than men.

By looking at the raw data, it's difficult to say, with certainty, that this is the case, so we test the hypothesis using chi square.

Our first order of business is to find the "expected" frequency.

The "expected" frequency is R x C / N (where R is the ROW total, C is the COLUMN total, and N is the overall number).

So the ROW total for men is 40.
The ROW total for women is 57.

The Column total for Republicans is 43.
The Column total for Democrats is 54.

The overall N is 97.

The expected frequency for males who should be Republicans based on chance is 40 x 43 = 1,720, and 1,720 / 97 = 17.73.

Ok, so now we know that the observed frequency for men who are Republicans is 23, and the expected frequency is 17.73. This gives us a difference of 5.27. We square this value and get 27.77.

Once we have that value, we divide by the expected value and get this-- 27.77/17.73 = 1.57 (we always round to the nearest hundredth).

Remember, though, chi-square is the SUM OF these values, so we have to compute one for each cell.

So, we repeat the process for each "cell" and then add up the totals.

Once we have the sum of the chi-squares, we check with a chi-square chart to see if it's significant at the .05 level. You can check here-- chi-square chart.

You'll note something called "degrees of freedom," or "df." A df helps us to determine what line to look at on the chart. The easiest way to remember df is this-- it's (R-1) x (C-1) where R is the number of rows and C is the number columns. In this case, we have 2 rows and 2 columns, which gives us a df of 1 because (R-1) = (2-1), and (C-1) = (2-1), and 1 x 1 = 1.
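The building blocks above translate directly into Python. This sketch checks only the male-Republican cell worked out in the text (the function names are mine); the sum over the remaining three cells is left for you, since that's the exercise:

```python
# Chi-square building blocks for the gender/party table above.
def expected(row_total, col_total, n):
    """Expected cell frequency under chance: (R x C) / N."""
    return row_total * col_total / n

def cell_term(observed, exp):
    """One cell's contribution: (observed - expected)^2 / expected."""
    return (observed - exp) ** 2 / exp

# First cell worked in the text: male Republicans.
exp_male_rep = expected(40, 43, 97)
print(round(exp_male_rep, 2))                 # 17.73
print(round(cell_term(23, exp_male_rep), 2))  # 1.57

# Degrees of freedom: (rows - 1) x (columns - 1)
df = (2 - 1) * (2 - 1)                        # 1
# Repeat cell_term for the other three cells and add the four terms
# to get the chi-square value.
```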

Also, remember that we use the .05 level of significance.

Go ahead and solve for the sum of chi-square and post what you get. Tell me if it's significant and whether the hypothesis is supported.

Jack

Intro to Statistics (Overview)

Statistics are mathematical methods to collect, organize, summarize, and analyze data. Statistics provide valid and reliable results only when the data collection and research methods follow established scientific procedures. With the development of the computer, the science of statistics has changed dramatically.

Basic statistical procedures include:

descriptive statistics,
sample distribution, and
data transformation.

In descriptive statistics, the chapter presents the concept of data distribution, frequency distribution, cumulative frequency, histogram, bar chart, frequency polygon, normal curve, and skewness.

Summary statistics make data more manageable by measuring two basic tendencies of distributions: 1) central tendency; and 2) dispersion (variability). These statistics make it easier for researchers to understand data.

Central tendency statistics provide information about the grouping of numbers in a distribution by giving a single number that characterizes the entire distribution. Using the mode, median, and mean, researchers can figure out a typical score of a distribution.

In addition, dispersion measures describe the way scores are spread out about a central point. Using range, variance, and standard deviation allows researchers to understand the characteristics of the data.
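Both kinds of summary statistics are easy to compute with Python's standard library. The quiz scores below are hypothetical, used only to show the measures named above:

```python
import statistics

# Hypothetical quiz scores, used only to illustrate the summary statistics.
scores = [2, 4, 4, 4, 5, 5, 7, 9]

# Central tendency: a single number characterizing the distribution.
print(statistics.mean(scores))    # 5
print(statistics.median(scores))  # 4.5
print(statistics.mode(scores))    # 4

# Dispersion: how the scores spread about the central point.
print(max(scores) - min(scores))  # range: 7
print(statistics.pstdev(scores))  # population standard deviation: 2.0
```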

The term sample distribution refers to the distribution of some characteristic, measured on the individuals or other units of analysis that were part of a sample. Additionally, it's important to understand the notion of a sampling distribution—a theoretical probability distribution of all values of a variable for a given sample size.

Most statistical procedures are based on the assumption that the data are normally distributed. When some anomalies arise, researchers can attempt to transform the data to achieve normality. Data transformation can be possible by multiplying or dividing each score by a certain number, or taking the square root or log of the scores.
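The transformations mentioned above are one-liners in practice. The skewed scores here are invented for illustration; the point is simply that a square-root or log transform pulls extreme high values in toward the rest of the distribution:

```python
import math

# Hypothetical right-skewed scores.
scores = [1, 4, 9, 100, 10000]

halved = [x / 2 for x in scores]            # divide each score by a constant
roots = [math.sqrt(x) for x in scores]      # square-root transform
logs = [math.log10(x) for x in scores]      # log transform

# The log transform compresses the extreme high score the most,
# which is why it is often used to reduce positive skew.
print(roots)  # [1.0, 2.0, 3.0, 10.0, 100.0]
print(logs)
```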

Hypothesis Testing Overview

Hypothesis development in scientific research is important because the process refines and focuses research by excluding extraneous variables and permitting variables to be quantified.

Scientists rarely begin a research study without a problem or a question to test. Without research questions or hypotheses, research proves to be a waste of time.

Researchers develop studies based on existing theory and are thus able to make predictions about the outcome of their work. Therefore, hypothesis development is usually the culmination of a rigorous literature review (we don't have the time to conduct a full-scale literature review in this class, so our theories will be based on our own experiences).

Researchers should use hypotheses in scientific research to:

1) provide direction for a study;
2) eliminate trial-and-error research;
3) rule out intervening and confounding variables; and,
4) allow for quantification of variables.

In addition, hypotheses should be:

1) compatible with current knowledge in the area;
2) logically consistent;
3) stated concisely; and,
4) testable.


In hypothesis testing, a researcher either rejects or fails to reject the null hypothesis that the statistical differences being analyzed are due to chance or random error.

To determine the statistical significance of a research study, the researcher must set a probability level (significance level) against which the null hypothesis is tested. If the results of the study indicate a probability lower than this level, the researcher can reject the null hypothesis. If the research outcome has a high probability, the researcher fails to reject the null hypothesis. It is common practice in mass media research studies to set the probability level at .05, which means that five times out of 100, significant results would occur because of random error or chance alone. Another way to think of this is to say that "we are 95% confident that our results are not due to chance."

All research contains error. Typically, two types of error (Type I error: the rejection of a null hypothesis that should not be rejected, and Type II error: the acceptance of a null hypothesis that should be rejected) are relevant to hypothesis testing.

There is always the possibility of making an error in rejecting or failing to reject a null hypothesis. It is not easy for researchers to balance these two error types, but one procedure, power analysis, helps researchers deal with the problem. Power (the probability of rejecting the null hypothesis when it is false) indicates the probability that a statistical test of a null hypothesis will result in the conclusion that the phenomenon under study actually exists; with adequate power, if there is a difference, researchers are able to detect it.

Sunday, March 22, 2015

Exam #2 Date: April 6, 2015

The date of the second exam is April 6, 2015.

It will be an open-book, open-resource exam.

Jack


Big Data Show that Nepotism Rules in America

THERE is a very real chance that the presidential election in 2016 will pit Jeb Bush against Hillary Clinton. According to oddsmakers, this is the likeliest outcome.
Many Americans are uncomfortable with the idea that two families could dominate the presidency that way. Whether or not you like one of the candidates, it just doesn’t feel right, in part because a second Bush-Clinton election makes a mockery of our self-identification as a democratic meritocracy.
How bad is America’s nepotism problem? Can data science help us gauge its depth? It can — and what the data shows is that something has gone haywire.
I studied the probability of male baby boomers’ reaching the same level of success as their fathers. I had to limit myself to fathers and sons because this was a highly sexist period in which women held few powerful political positions.
Let’s start with the presidency. Thirteen sons of presidents were born during America’s baby boom. One of the 13 became president himself, of course, and Jeb would make a second. Of the roughly 37 million boomer males who weren’t born to a president, two won the White House. Maybe it’s an anomaly that George W. Bush became president in 2001, but his advent means that in our era a son of a president was roughly 1.4 million times more likely to become president than his supposed peers.
The presidency is obviously a small sample. But the same calculations can be done for other political positions. Take governors.
Because it is difficult to be sure that you have counted all the sons of governors, let’s assume that governors reproduce at average rates. This would mean there were about 250 baby boomer males born to governors. Five of them became governors themselves, about one in 50. This is 6,000 times the rate of the average American. The same methodology suggests that sons of senators had an 8,500 times higher chance of becoming a senator than an average American male boomer.
There is some evidence that the parental advantage in politics is actually getting bigger. George W. Bush ended a 171-year drought for presidential sons. From 2003 to 2006, the Senate had the highest percentage of senators’ children — six — in its history.

Thanks, Dad!

Thirteen sons of presidents were born during America's baby boom. One, George W. Bush, also became president. Below are the odds that a boomer man matched his father's achievement — compared to the odds for the average male boomer.
Odds of following dad's footsteps vs. odds for average boomer men:

BILLIONAIRES: 1 in 9 vs. 1 in 258,141 (Ross Perot, Ross Perot Jr.)

PRESIDENTS: 1 in 13 vs. 1 in 18,715,250 (George W. Bush, George Bush)

SENATORS: 1 in 47 vs. 1 in 398,197 (Al Gore Sr., Al Gore Jr.)

GOVERNORS: 1 in 51 vs. 1 in 306,807 (Mitt Romney, George Romney)

M.L.B. PLAYERS: 1 in 73 vs. 1 in 14,966 (Barry Bonds, Bobby Bonds)

N.F.L. PLAYERS: 1 in 113 vs. 1 in 7,220 (Dave Shula, Don Shula; shown later as coaches)
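The multipliers quoted in the article can be recovered from these "1 in X" odds with a quick division. This is only a sketch of that arithmetic (the `edge` helper is my own name; the figures come straight from the chart):

```python
# Multipliers implied by the chart: divide the average man's odds
# by the son-of-an-office-holder's odds.
def edge(son_odds, average_odds):
    """How many times likelier the son is than the average boomer male."""
    return average_odds / son_odds

print(round(edge(13, 18_715_250)))  # presidents: ~1.4 million
print(round(edge(47, 398_197)))     # senators: 8472, i.e. roughly 8,500
print(round(edge(51, 306_807)))     # governors: 6016, i.e. roughly 6,000
```

These match the "1.4 million," "8,500 times," and "6,000 times" figures used in the text.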
Is this electoral edge unusual? Successful parents, whatever their occupation, pass on their genes and plenty of other stuff to their kids. Do different fields have similar familial patterns?
In just about every field I looked at, having a successful parent makes you way more likely to be a big success, but the advantage is much smaller than it is at the top of politics.
Using the same methodology, I estimate that the son of an N.B.A. player has about a one in 45 chance of becoming an N.B.A. player. Since there are far more N.B.A. slots than Senate slots, this is only about an 800-fold edge.
Think about the N.B.A. further. The skills necessary to be a basketball player, especially height, are highly hereditary. But the N.B.A. is a meritocracy, with your performance easy to evaluate. If you do not play well, you will be cut, even if the team is the New York Knicks and your name is Patrick Ewing Jr. Father-son correlation in the N.B.A. is only one-eleventh as high as it is in the Senate.
The parental edge in football and baseball is much lower than it is in basketball, probably because there is less reliance on height.
I went through a wide range of fields and found a consistent pattern: greater success for the sons, but nothing like the edge a winning politician provides.
Here is the estimated parental edge for other big American prizes and positions. An American male is 4,582 times more likely to become an Army general if his father was one; 1,895 times more likely to become a famous C.E.O.; 1,639 times more likely to win a Pulitzer Prize; 1,497 times more likely to win a Grammy; and 1,361 times more likely to win an Academy Award. Those are pretty decent odds, but they do not come close to the 8,500 times more likely a senator’s son is to find himself chatting with John McCain or Dianne Feinstein in the Senate cloakroom.
THE Bush story is also telling, when we compare it to familial success in other fields.
Has any modern family dominated a meritocracy the way that the Bushes dominate politics? I could not find one. The Mannings, in football, probably come closest. But while Archie Manning, the father of two Super Bowl-winning quarterbacks, Peyton and Eli, was a solid N.F.L. player, he was hardly the football equivalent of a president.
Internationally, the greatest father-son, merit-based, same-field accomplishment is probably Niels Bohr’s son Aage matching his father’s Nobel Prize in Physics. But neither the Bohrs nor the Mannings dominated physics or football the way the Bush family dominates American politics.
Regression to the mean limits family dominance in any meritocratic field. If you have a well-above-average dose of a trait, you can expect your child to be closer to average.
Regression to the mean is so powerful that once-in-a-generation talent basically never sires once-in-a-generation talent. It explains why Michael Jordan’s sons were middling college basketball players and Jakob Dylan wrote two good songs. It is why there are no American parent-child pairs among Hall of Fame players in any major professional sports league.
The Bush family’s dominance would be the basketball equivalent of Michael Jordan being the father of LeBron James and Kevin Durant — and of Michael Jordan’s father being Walt Frazier.
In other words, it is virtually impossible, statistically speaking, that Bushes are consistently the most talented people to lead our country. Same for Chelsea Clinton or any other member of a political dynasty thought to be possible presidential timber.
Politics is not the absolute worst field in giving an advantage to certain families. In my research, I found two fields with a bigger family edge.
First is billionaires. According to my calculation, you have about a 28,000 times higher chance of being a billionaire if your father was a billionaire. And billionaires like the Waltons or the Rockefellers before them probably dominate American wealth more than the Bushes dominate American politics.
These billionaires, of course, have inherited their status, not earned it. Call me jaded, but it seems to me that most heirs to billionaires don’t do much more than marry nice-looking people and take sports franchises that I support and run them into the ground.
The second group is reality TV stars. You have about a 9,300 times edge in becoming a reality television star if your father is one. But this is precisely because some of these shows star famous people’s families.
We should not take this criticism too far. In 2008, the United States chose the mixed-race son of a Kenyan and a Kansan to be president. More than 90 percent of senators had parents who weren’t top politicians. And political campaigns can be unpredictable. For all we know, the 2016 election could be fought between Senator Elizabeth Warren, daughter of a janitor, and Gov. Scott Walker, son of a minister.
Unless of course the Democratic candidate is Andrew M. Cuomo, son of a governor, and the Republican candidate is Rand Paul, son of a congressman.
There are plenty of countries that are worse. Over the past 50 years, being the son of a leader of North Korea increased your probability of being a leader of North Korea by a factor of infinity. An infinite advantage to having a powerful father has been common in human history.
But Big Data allows us to make more than vague comparisons to other countries or time periods. We can see precisely how much families dominate in many different spheres, and we can see what true meritocracies look like. The data shows conclusively that we have a nepotism problem. So now the question is: Why does the modern United States tolerate this level of privilege for political name brands?

Monday, March 16, 2015

Sample Research Projects

Music and Activities

https://drive.google.com/file/d/0B2Dv4MQMdbN4T05wQUpTSXdqN28/view?usp=sharing

Music and Social Media

https://drive.google.com/file/d/0B2Dv4MQMdbN4TFJzYXFaMEVsekE/view?usp=sharing

Stress and Television Habits

https://drive.google.com/file/d/0B2Dv4MQMdbN4aS16ZUZJUnRsd1U/view?usp=sharing

Four Criteria Used to Evaluate Qualitative Research

https://drive.google.com/file/d/0B2Dv4MQMdbN4c2pvakk0dmkwYk0/view?usp=sharing

Nature of Qualitative Research (Qualitative vs. Quantitative)

The Nature of Qualitative Research

All research, whether quantitative or qualitative, must involve an explicit, disciplined, systematic approach to finding things out, using the method most appropriate to the question being asked. Consideration should be given to these common goals, although the differences between qualitative and quantitative research have often been exaggerated in the past.  The table below summarizes some of the ways in which qualitative and quantitative research do differ:

Table 1

Qualitative research: tends to focus on how people or groups of people can have (somewhat) different ways of looking at reality (usually social or psychological reality)
Quantitative research: tends to focus on ways of describing and understanding reality by the discovery of general "laws"

Qualitative research: takes account of complexity by incorporating the real-world context -- can take different perspectives on board
Quantitative research: takes account of complexity by precise definition of the focus of interest and techniques that mean that external "noise" can be discounted

Qualitative research: studies behaviour in natural settings or uses people's accounts as data; usually no manipulation of variables
Quantitative research: involves manipulation of some variables (independent variables) while other variables (which would be considered to be extraneous and confounding variables) are held constant

Qualitative research: focuses on reports of experience or on data which cannot be adequately expressed numerically
Quantitative research: uses statistical techniques that allow us to talk about how likely it is that something is "true" for a given population in an objective or measurable sense

Qualitative research: focuses on description and interpretation and might lead to development of new concepts or theory, or to an evaluation of an organisational process
Quantitative research: focuses on cause & effect -- e.g. uses experiment to test (try to disprove) an hypothesis

Qualitative research: employs a flexible, emergent but systematic research process
Quantitative research: requires the research process to be defined in advance


Qualitative research is concerned with developing explanations of social phenomena.  That is to say, it aims to help us to understand the social world in which we live and why things are the way they are.  It is concerned with the social aspects of our world and seeks to answer questions about:

·    Why people behave the way they do
·    How opinions and attitudes are formed
·    How people are affected by the events that go on around them
·    How and why cultures and practices have developed in the way they have

In a health or social care setting, qualitative research is particularly useful where the research question involves one of the situations below and people’s experiences and views are sought:

·   exploration or identification of concepts or views
·   exploration of “implementability”               
·   the real-life context
·   sensitive topics where flexibility is needed to avoid causing distress



In the past the distinguishing features of qualitative and quantitative research have been used as criticisms by proponents of the "other" methodology. For example, one common criticism leveled at qualitative research has been that the results of a study may not be generalizable to a larger population because the sample group was small and the participants were not chosen randomly. However, if the original research question sought insight into a specific subgroup of the population, not the general population -- because the subgroup is "special" or different from the general population and that specialness is the focus of the research -- the small sample may have been appropriate. This would be the case with some ethnic groups or some patient groups suffering from rare conditions, or patient or health care groups in particular circumstances. In such studies, generalizability of the findings to a wider, more diverse population is not an aim.

Another example is the label of reductionism, based on the requirement of the experimental method to eliminate all but one measurable variable, which is used to imply criticism of quantitative methodology. The rigour involved in a well designed and executed experiment is a strength of quantitative research, just as an alternative approach which engages with context is a strength of qualitative methodology.

Qualitative Data Collection Methods

In this section, methods of qualitative research data collection are outlined.  The main methods are:

1)    interviews
2)    focus groups
3)    observation
4)    collection of documented material such as letters, diaries, photographs
5)    collection of narrative
6)    open ended questions in questionnaires (other aspects of questionnaires are covered in the resource pack Surveys and Questionnaires)

Interviews
Interviewing can, at one extreme, be structured, with questions prepared and presented to each interviewee in an identical way using a strict predetermined order.  At the other extreme, interviews can be completely unstructured, like a free-flowing conversation. Qualitative researchers usually employ “semi-structured” interviews which involve a number of open ended questions based on the topic areas that the researcher wants to cover.  The open ended nature of the questions posed defines the topic under investigation but provides opportunities for both interviewer and interviewee to discuss some topics in more detail.  If the interviewee has difficulty answering a question or provides only a brief response, the interviewer can use cues or prompts to encourage the interviewee to consider the question further.  In a semi structured interview the interviewer also has the freedom to probe the interviewee to elaborate on an original response or to follow a line of inquiry introduced by the interviewee.  An example would be:

Interviewer:   "I'd like to hear your thoughts on whether changes in government policy have changed the work of the doctor in general practice.  Has your work changed at all?" 

Interviewee:  "Absolutely!  The workload has increased for a start."

Interviewer:   "Oh, how is that?"

Preparation for semi-structured interviews includes drawing up a "topic guide", which is a list of topics the interviewer wishes to discuss. The guide is not a schedule of questions and should not restrict the interview, which needs to be conducted sensitively and flexibly, allowing follow up of points of interest to either interviewer or interviewee. In addition to the topic guide, the interviewer will probably want to approach the interview with written prompts to him/herself in order to make sure that the necessary preliminary ground is covered concerning such things as the information leaflet (has the interviewee understood it and got any questions?), the consent form (has it been signed?), and the voice recorder (is it switched on?).

The semi-structured interview is possibly the most common qualitative research data gathering method in health and social care research, as it is relatively straightforward to organize. That does not, however, mean that it is easy to conduct good qualitative research interviews. A good interviewer needs to be able to put an interviewee at ease, needs good listening skills, and needs to be able to manage an interview situation so as to collect data which truly reflect the opinions and feelings of the interviewee concerning the chosen topic(s).

A quiet, comfortable location should be chosen, and the interviewer should give consideration to how s/he presents her/himself in terms of dress, manner and so on, so as to be approachable. Most commonly, interviews are audio recorded. Digital voice recorders are excellent for this and are easier to use and less intrusive than tape recorders. Interviews may also be video-taped if details such as non-verbal signals are needed for the analysis. In practice, it may be more difficult to obtain the approval of the relevant ethics committee(s) for video-recording, and it may be more difficult to get consent from interviewees.

A form of interview can be conducted by email. This will generate qualitatively different types of response from participants, partly because they are able to delay responding until they have thought about what to say. Interesting research is being carried out on the special features of email communications.

As with all other research (qualitative and quantitative), audit trails are good practice. Therefore, a reflexive diary should be kept by the researcher. Part of this should take the form of field notes and it is good practice to enter observations and impressions about each interview into a notebook as soon as possible after the interview has taken place.


Focus Groups
In a way focus groups resemble interviews, but focus group transcripts can be analysed so as to explore the ways in which the participants interact with each other and influence each other's expressed ideas, which obviously cannot happen with one-to-one interview material. In common with semi-structured interviews, focus group conveners use topic guides to help them keep the discussion relevant to the research question. Focus groups are not necessarily a cheaper and quicker means to an end than interviews, as focus groups may be more difficult to manage and more difficult to convene simply because more people are involved. Focus groups are considered to work well with approximately 8 people, but this is not always easy to arrange -- do you invite more in the expectation that one or two will not turn up? If so, how do you manage if 10 or 12 present themselves? Or, if not, what if only 3 or 4 turn up (as a courtesy to them you will probably have to proceed)? For issues concerning sampling and constitution of focus groups, see Section 5.

Focus groups are ideally run in accessible locations where participants can feel comfortable and relaxed. The time of day and facilities offered will need to be appropriate for the particular target group: for example, is a crèche needed? Is there adequate car-parking space? It is better if the discussion is not interrupted, and so it is a good idea to offer refreshments and to point out toilet facilities beforehand. Serving refreshments as people arrive also serves as a good "ice-breaker" and allows participants to meet each other before the focus group starts.

An important preliminary for conducting focus groups is laying down the “ground rules”. One of these concerns confidentiality, and this needs careful planning at the proposal and ethics committee application stage. Members of a focus group may not speak openly unless they are comfortable that others present will treat their contributions as confidential. It could be laid down as a condition of the focus group that it is expected that the content of the discussion which is about to take place will only be known by those present. All participants should indicate their agreement to this. Alternatively, if this seems unrealistic, the facilitator could point out that there are ways of presenting ideas that avoid breaching confidentiality: for instance, a participant can say “I have heard on the grapevine that ‘x’ sometimes happens” rather than saying “‘x’ has happened to me”, and that participants might adopt this policy.

Acting as facilitator of a focus group, the researcher must allow all participants to express themselves and must cope with the added problem of trying to prevent more than one person speaking at a time, in order to permit identification of the speakers for the purposes of transcription and analysis. This is something else which should be requested when laying down the “ground rules”. Unless the proceedings are being videoed, it is a good idea to have an observer present. This person’s role could be to note which participant is saying what, which can be done if each person is labelled with a number or letter and the relevant label is noted alongside the first word or two of his/her contribution. Another point to make clear at the outset is the planned completion time for the discussion.

Observation

Not all qualitative data collection approaches require direct interaction with people. Observation is a technique that can be used when data cannot be collected through other means, or when data collected through other means are of limited value or are difficult to validate. For example, in interviews participants may be asked how they behave in certain situations, but there is no guarantee that they actually do what they say they do. Observing them in those situations is more valid, since it is possible to see how they actually behave. Observation can also produce data for verifying or refuting information provided in face-to-face encounters.

Some research requires observation not of people but of the environment. This can provide valuable background information about the setting in which a research project is being undertaken. For example, an ethnographic study of a children’s ward may need information about the layout of the ward or about how people dress. In a health needs assessment or a locality survey, observations can provide broad contextual descriptions of the key features of the area: for example, whether the area is inner city, urban or rural, its geographical location, and the size of its population. Observation can describe the key components of the area, such as the main industries and types of housing, and can identify the availability of services: the number, type and location of health care facilities such as hospitals and health centers, care homes, leisure facilities and shopping centers.

Techniques for collecting data through observation:

Written descriptions. The researcher can record observations of people, a situation or an environment by making notes of what has been observed. The limitations of this are similar to those of trying to write down interview data while an interview takes place. First, there is a risk that the researcher will miss observations because s/he is writing about the last thing s/he noticed. Secondly, the researcher’s attention may focus on a particular event or feature because it appears particularly interesting or relevant, causing her/him to miss things which are equally or more important but whose importance is not recognized or acknowledged at the time.

Video recording. This frees the observer from the task of making notes at the time and allows events to be reviewed repeatedly. One disadvantage of video recording is that the actors in the social world may be more conscious of the camera than they would be of a person, and this may affect their behavior. They may even try to avoid being filmed. This problem can be lessened by placing the camera at a fixed point rather than carrying it around. In either case, though, only events in the line of the camera can be recorded, limiting the range of possible observations and perhaps distorting conclusions.

Artifacts. Artifacts are objects which inform us about a phenomenon under study because of their significance to that phenomenon. Examples would be doctors’ equipment in a particular clinic or the artwork hung in residential care homes.

Collection of Documented Material such as Letters, Diaries, Photographs

Documentation. A wide range of written materials can yield qualitative information. These can be particularly useful in trying to understand the philosophy of an organisation, as may be required in ethnography. They include policy documents, mission statements, annual reports, minutes of meetings, codes of conduct, web sites, series of letters or emails, case notes, health promotion materials, etc. Diary entries may be used retrospectively (it is reasonable to assume that diarists recorded things which were important to them at the time of the entry), or diaries may be given to research participants who are asked to keep an account of issues or their thoughts concerning diet, medication, interactions with health care services, or whatever is the subject of the research. Audio diaries may be used if the written word presents problems. Notice boards can also be a valuable source of data.

Photographs are a good way of collecting information which can be captured in a single shot or series of shots. For example, photographs of buildings, neighborhoods, dress and appearance could be analyzed in such a way as to develop theory about professional relationships over a given time period. Photographs may be produced for research purposes or existing photographs may be used for analysis. As with every method of data collection, any ethical implications of collecting documents should be considered.

Collection of Narrative

A story told by a research participant, or a conversation between two or more people, can be used as data for qualitative research (see Section 3). Data collected in this way should be entirely naturally occurring, not shaped as in a semi-structured interview or focus group. Narrative data can, however, be collected in the course of a form of interview. The “narrative interview” begins with a “generative narrative question” which invites the interviewee to relate his/her account of his/her life history or a part of it. This could be an account of living with a chronic illness, or with a child with special needs, or of being a carer for an elderly relative. During this first part of the interview, the interviewer should listen actively but should not interject with further questioning. When the narrator indicates that the narrative is complete, there follows a questioning phase in which the interviewer elicits further information on fragments which have been introduced. This may be followed by a balancing phase where first “how” and then “why” questions are asked in order to gain further explanation of aspects of the narrative.

Open-ended questions in questionnaires

Open-ended questions, the responses to which are to be analyzed qualitatively, may be included in questionnaires even when the majority of the questionnaire will generate quantitative data. Open-ended questions usually require responses, reflecting the opinions of the respondents, to be written in blank spaces. This form of data may give useful guidance to a researcher planning an interview or focus group study. On its own, however, the outcome can be a source of frustration, as there is no opportunity to ask for clarification of any point made.