1. The Model

The aim of this study is to gain a better understanding of the people behind the swing revival. What makes someone an enthusiast? While there will be factors that cannot be easily quantified which could negate the predictive elements of this model, I hope the answers obtained from this analysis will give some insight into this phenomenon.

I have chosen to focus my analysis on the factors that contribute to the number of swing camps attended. I use this as my dependent variable because attendance at these camps demonstrates a greater degree of involvement in the swing community.

My hypothesis is that the people who attend the camps make up the core of the swing revival, and further that this dependent variable is directly influenced by the following factors: age, income, time and money spent on dancing, and the dancer's degree of skill.

2. Literature review

There are no official numbers or studies documenting the growth of this resurgence of interest. By my own observation between June and November this year, however, clubs in Los Angeles, San Francisco, Dallas, Chicago, Cleveland and New York have had packed dance floors filled with young people, dressed in vintage clothing of the 1930s, 40s and 50s, twirling and jumping to big band music.

Termed "neo-swing", this phenomenon first started as an underground revival in the late 1980s to early 1990s and hit the mainstream when the Gap clothing store featured swing dancers in its TV commercial a year ago. Like most underground movements, the revival started out as just a few enthusiasts grouping together to learn more about this dance, which was danced to big band and jump blues music in the 1930s and 40s. Coincidentally, interest in the dance, which had its roots as a street dance, sprang up in parts of Europe (especially Sweden and the UK) and in the US at nearly the same time, sometime in the late 1980s and early 1990s.

The resurgence of interest in this dance form paralleled the development of the internet, which has allowed the swing community worldwide to communicate and propagate information using the web. There are over 200 swing websites worldwide (possibly more). They provide information on events, shopping, contacts, classes and camps. Some are archives of information on films which feature the dance, the music and its history, and some even contain online dance lessons using computer animation. Information on the numerous dance camps can also be found on the web, for locations as far-flung as Herraeng, Sweden, and for events as far ahead as December 31, 1999, for a new millennium swing dance party in Puerto Vallarta, Mexico.

3. Source of data

The data come from a survey, which I conducted by posting an electronic form on the web and by sending the questions directly by email. The responses came mostly from the USA (including New York, the Washington DC area, Boston, Philadelphia, Seattle, Portland, Los Angeles, San Francisco, Atlanta, Pittsburgh, Salt Lake City, South Carolina, Texas and Virginia) and also included responses from Australia, Switzerland and Singapore.

I needed a survey that could be conducted quickly, so I designed one that could be accessed online via an electronic form (see Appendix). On November 29, I contacted 80 webmasters who run swing dancing web sites or who have email lists of dancers in their area. I also emailed copies of the form to people who contacted me personally to take part in this survey. Although I had over 200 responses in total, many left fields incomplete (especially the continuous data needed for the regression analysis), and those responses could not be used for this analysis. The final sample size is 116.

4. Independent and dependent variables - their significance

My hypothesis is that the dependent variable, number of camps attended, is affected by the independent variables listed below. A Pearson correlation was run between the variables with continuous data gathered in the survey. The following were chosen for inclusion in a multivariate regression analysis because they presented continuous data, showed a strong or moderate correlation with Camps, or had a theoretical basis for inclusion.
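As a minimal sketch of how a Pearson correlation between two survey variables can be computed, the function below implements the standard formula in plain Python. The function name `pearson_r` and the toy data are assumptions for illustration only; they are not the survey responses or the statistical package actually used.

```python
import math


def pearson_r(xs, ys):
    """Return the Pearson correlation coefficient of two equal-length sequences."""
    n = len(xs)
    if n != len(ys) or n < 2:
        raise ValueError("need two sequences of equal length, at least 2 items")
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    # Sum of cross-products and sums of squares of deviations from the means.
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    return cov / math.sqrt(var_x * var_y)


# Hypothetical toy data, NOT the survey responses.
ages = [19, 23, 27, 31, 36]
camps = [0, 1, 1, 3, 2]
r = pearson_r(ages, camps)
print(f"r = {r:.3f}")
```

A value near 0 indicates little linear association, while values toward +1 or -1 indicate a stronger positive or negative association.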

Dependent variable:

Number of times attended a swing dance camp ("Camps")

A feature of this sub-culture is that the more ardent practitioners of this dance attend swing camps. These are intensive dance training workshops which can last from a couple of days up to a week. I chose the number of dance camps attended as a dependent variable as the people who attend these camps form the core of "the scene" - they dress up in zoot suits and vintage dresses, dance in the swing clubs regularly and support the new swing bands that play in them.

Independent variables:

age in years ("Age")

The moderate positive correlation of 0.261 between Age and Camps (significant at the 0.005 level, 2-tailed) suggests that age should be included as an independent variable. It makes sense theoretically, as swing dancers have to be of a certain age to go to clubs and attend camps, and those who are slightly older will usually have had more time to master the skills sufficiently to attend the camps.

earnings in US dollars ("Income")

While there was close to no correlation between Camps and Income, I decided to include Income as a variable because I believe it has a positive effect on Camps. Attending the camps costs money, from $100 for a day to a few hundred dollars for a few days, and more if the participant has to travel out of state. Taking part in most swing activities also requires a moderate income. This implies that the participants need a source of income.

time spent dancing each week in number of hours ("Time")

There was close to no correlation between Time and Camps, but I felt this variable ought to be included because learning to dance takes practice, and time spent at swing dances would also be a positive indicator of the likelihood of attending the camps.

amount spent pursuing interest in dollars ("Spending")

The correlation between Spending and Camps was moderately positive at 0.324 (statistically significant, with a reported p-value of 0.000, 2-tailed). This merited its inclusion in the regression analysis for causal relationships. In addition, it was an indication of respondents' willingness to spend money on this pastime.

dance skill ("Novice", "Intermediate" or "Advanced")

As these camps tend to draw enthusiasts, they can be rather intimidating for rank beginners. I included degree of skill as a set of dummy variables, with Intermediate dancers (the majority) as the control, as I believe those with more advanced skills are more likely to attend camps and novices less likely to go. The correlation between Advanced skill dancers and Camps is moderately positive at 0.379 (statistically significant, with a reported p-value of 0.000, 2-tailed), and moderately negative between Novices and Camps at -0.236 (significant at the 0.011 level, 2-tailed).

The relationship between the variables can be expressed in a regression equation in this way, where Camps is the dependent variable, and Age, Income, Time, Spending, Novice and Advanced are the independent variables:

Camps = b_{1} + b_{2}Age + b_{3}Income + b_{4}Time + b_{5}Spending + b_{6}Novice + b_{7}Advanced
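To make the dummy coding concrete, the sketch below turns one survey response into the row of regressors implied by the equation above, with Intermediate as the omitted control category. The function name `design_row` and the example respondent are hypothetical and serve only to illustrate the coding scheme.

```python
def design_row(age, income, time, spending, skill):
    """Return [1, Age, Income, Time, Spending, Novice, Advanced] for one respondent.

    The leading 1 multiplies the constant b_1. Intermediate is the control
    group, so an intermediate dancer has Novice = 0 and Advanced = 0.
    """
    skill = skill.strip().lower()
    if skill not in {"novice", "intermediate", "advanced"}:
        raise ValueError(f"unknown skill level: {skill!r}")
    novice = 1 if skill == "novice" else 0
    advanced = 1 if skill == "advanced" else 0
    return [1, age, income, time, spending, novice, advanced]


# Hypothetical respondent, not taken from the survey data.
row = design_row(age=25, income=30000, time=5, spending=500, skill="Advanced")
print(row)
```

Because Intermediate is left out, b_{6} and b_{7} are read as differences from intermediate dancers rather than as absolute levels.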

My hypothesis is that the independent variables have a causal link with the dependent variable and that the relationship is statistically significant. I expect age and income to have a positive relationship with the number of camps attended. I also expect that novices will be less likely to attend camps than dancers with advanced skills.

This has to be compared with the null hypothesis, which in each case is that there is no relationship between the independent variable and the number of camps attended:

H_{null}: b_{2} = 0, b_{3} = 0, b_{4} = 0, b_{5} = 0, b_{6} = 0, b_{7} = 0

The alternative is that the independent variable has an effect (positive in all cases except Novice) on the dependent variable:

H_{alternative}: b_{2} > 0, b_{3} > 0, b_{4} > 0, b_{5} > 0, b_{6} < 0, b_{7} > 0

5. Results of statistical analysis

A regression was run with results (see output in appendix) that allowed the formulation of this equation that will be used to explain involvement in the swing community by the number of camps attended:

Camps = -2.92 + 0.269Age* - 0.0000795Income* - 0.07501Time* + 0.00147Spending* - 1.444Novice + 3.12Advanced*
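As a rough illustration of how the fitted equation is read, the sketch below plugs one hypothetical respondent into the estimated coefficients. The coefficients are those reported above; the function name `predict_camps` and the example respondent are assumptions made for illustration, and the result is a point estimate only.

```python
# Estimated coefficients as reported in the fitted equation.
COEFFICIENTS = {
    "constant": -2.92,
    "age": 0.269,
    "income": -0.0000795,
    "time": -0.07501,
    "spending": 0.00147,
    "novice": -1.444,
    "advanced": 3.12,
}


def predict_camps(age, income, time, spending, skill="intermediate"):
    """Return the predicted number of camps attended for one respondent."""
    skill = skill.strip().lower()
    if skill not in {"novice", "intermediate", "advanced"}:
        raise ValueError(f"unknown skill level: {skill!r}")
    c = COEFFICIENTS
    total = c["constant"]
    total += c["age"] * age
    total += c["income"] * income
    total += c["time"] * time
    total += c["spending"] * spending
    # Dummy variables: intermediate dancers are the control group.
    if skill == "novice":
        total += c["novice"]
    elif skill == "advanced":
        total += c["advanced"]
    return total


# Hypothetical respondent: 25 years old, $30,000 income, 5 hours per week,
# $500 spent on swing, intermediate skill.
intermediate = predict_camps(25, 30000, 5, 500)
advanced = predict_camps(25, 30000, 5, 500, skill="advanced")
print(f"intermediate: {intermediate:.2f} camps, advanced: {advanced:.2f} camps")
```

Holding everything else fixed, the two predictions differ by exactly the Advanced coefficient, which is how the dummy variable is interpreted relative to the control group.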

| Variable | Coefficient | t-value | p-value |
|----------|-------------|---------|---------|
| Constant | -2.92 | -1.769 | .080 |
| Age | .269 | 3.412 | .001 |
| Income | -.0000795 | -3.403 | .001 |
| Time | -.07501 | -1.768 | .08 |
| Spending | .00147 | 2.150 | .034 |
| Novice | -1.444 | -1.533 | .128 |
| Advanced | 3.12 | 3.347 | .001 |

*These results are statistically significant (Time, with a p-value of 0.08, only at the 10% level).

The tests showed that most of the variables were statistically significant at the 95% confidence level; Time is significant only at the 90% level.

With Age and Advanced, I could reject the null hypothesis, as the confidence intervals for the coefficients did not contain zero (see output in Appendix). In fact, with p-values of 0.001, we can be 99.9% confident in rejecting the null hypothesis for both of these variables.
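As a rough cross-check on the reported p-values, a two-sided p-value can be approximated from a t-value using the standard normal distribution. This is a reasonable stand-in for the t distribution when the residual degrees of freedom are large (here 116 observations minus 7 estimated coefficients, or 109). The sketch below is an approximation under that assumption, not the calculation performed by the statistical package, and the function name is hypothetical.

```python
import math


def two_sided_p_normal(t_value):
    """Approximate two-sided p-value, treating the t statistic as standard normal."""
    # erfc(|z| / sqrt(2)) equals 2 * (1 - Phi(|z|)) for a standard normal z.
    return math.erfc(abs(t_value) / math.sqrt(2))


# t-values as reported in the regression results.
T_VALUES = {
    "constant": -1.769,
    "age": 3.412,
    "income": -3.403,
    "time": -1.768,
    "spending": 2.150,
    "novice": -1.533,
    "advanced": 3.347,
}

for name, t in T_VALUES.items():
    p = two_sided_p_normal(t)
    print(f"{name:9s} t = {t:7.3f}  approx. p = {p:.3f}")
```

The approximate values will differ slightly from the exact t-distribution p-values, but they should fall on the same side of the 0.05 threshold for each variable.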

With Income, Time, Spending and Novice I could not reject the null hypotheses, as the confidence intervals for these variables could contain zero (in the case of Income and Spending, the numbers were so small that they tended to zero, which makes it difficult to be certain of their effect). I still kept them in my regression analysis as they were theoretically important to my hypothesis. Income had a bearing on the means to attend camps, and Time and Spending on the propensity to put time and resources into attending camps. Novice was not statistically significant; however, it is important as a comparison with the behavior of Advanced swing dancers.

The multivariate regression analysis indicates there may also be a degree of multicollinearity between Age and Income. This makes sense, as there is a strong likelihood of a straight-line relationship between age and income within a certain age range. The correlation between age and income is relatively high at 0.742 (with a reported significance of 0.000). This would suggest that the variables move in tandem, which would reduce their value as predictors of the variable Camps. Also, when the regression was run taking out Age and Income in turn, the estimates changed, indicating an instability associated with multicollinearity. However, there is no evidence of the other usual sign of multicollinearity, a high R^{2} with few statistically significant coefficients: the R^{2} is low at 0.319, with five out of six variables statistically significant.
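One common way to put a number on this concern is the variance inflation factor (VIF). The sketch below computes it from the pairwise Age-Income correlation alone, which is the VIF that would apply if these were the only two predictors. Because the full model has more predictors, the true VIF for Age or Income would be at least this large. The function name is hypothetical, and this diagnostic was not part of the original analysis.

```python
def vif_from_pairwise_r(r):
    """Return 1 / (1 - r^2), the VIF in a model with exactly two predictors."""
    if not -1.0 < r < 1.0:
        raise ValueError("r must lie strictly between -1 and 1")
    return 1.0 / (1.0 - r * r)


# Correlation between Age and Income as reported in the text.
vif_age_income = vif_from_pairwise_r(0.742)
print(f"VIF (two-predictor case) = {vif_age_income:.3f}")
```

A VIF of 1 would mean no inflation of the coefficient variance; larger values mean the standard errors of the Age and Income coefficients are inflated by their overlap.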

But the regression results showed that a number of my predictions were wrong. The negative coefficients for income and time run contrary to my prediction that they would influence the number of camps attended positively.

That the income coefficient is not positive may be explained by the disproportionate number of students who responded to the survey. Most of them have little or no income, yet 19 out of 32 (59%) still manage to go to the camps. An omitted variable could be income from parental sources, student loans and part-time jobs, which respondents may not have reported in the survey. Also, those in the higher income bracket do not necessarily go to these camps. That is reasonable, as they may have less free time.

The absolute value of the income coefficient is also very small. Even though the p-value of 0.001 gives us 99.9% certainty that the coefficient is statistically significant, the smallness of the figure suggests that income is not substantively important. It is also difficult to determine whether we can reject the null hypothesis, as the numbers in the confidence interval are so small. As they are so small as to tend to zero, they cannot have any substantive importance.

The negative coefficient may be explained by the fact that this sample contains 29 (25%) novices who are less likely to attend camps but are likely to spend time at clubs or classes during the week. In any case, the small absolute value of the coefficient also suggests that it holds no substantive meaning, even though we can be 99.92% sure it is statistically significant.

Another problematic result is Time, which also carries a negative coefficient. The fact that it is also a small value, and that the null hypothesis cannot be rejected, makes it altogether a weak predictor. This may be attributed to the large number of novices (25%), who may be likely to spend time taking classes but not go to camps.

Spending is significant at the 0.034 level, but as the null hypothesis cannot be rejected, it is not certain if it will have an impact on the camps attended.

Using Intermediates as a control group, the dummy variables do show that relative to the control group, the Advanced group is more likely to go to swing camps, with a statistical significance of 0.002. Novices are less likely, but this is not a significant coefficient nor can the null hypothesis be rejected.

That the constant, -2.92, does not make sense is not surprising given the results. However, as this model is intended not so much to predict a result as to explain behavior, this does not matter greatly.

6. Evaluation of the model and problems

The R^{2} is interpreted as the percentage of variation in the dependent variable explained by the model. An R^{2} of 0.319 says that the model explains only 31.9% of the variation in the number of camps attended. While this is an acceptable outcome and implies the model captured a meaningful proportion of the variation in camp attendance, it suggests that a number of problems still remain to be dealt with.
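A related figure, not reported in the regression discussion above, is the adjusted R^{2}, which penalises the R^{2} for the number of predictors. The sketch below computes it from the reported R^{2} of 0.319, the sample size of 116 and the six predictors in the equation. The function name is hypothetical, and the result is offered only as a back-of-the-envelope check.

```python
def adjusted_r_squared(r_squared, n_obs, n_predictors):
    """Return 1 - (1 - R^2) * (n - 1) / (n - k - 1)."""
    residual_df = n_obs - n_predictors - 1
    if residual_df <= 0:
        raise ValueError("need more observations than estimated coefficients")
    return 1.0 - (1.0 - r_squared) * (n_obs - 1) / residual_df


# Reported R^2 = 0.319, n = 116 respondents, k = 6 predictors
# (Age, Income, Time, Spending, Novice, Advanced).
adj = adjusted_r_squared(0.319, 116, 6)
print(f"adjusted R^2 = {adj:.3f}")
```

The adjusted value is necessarily a little lower than the unadjusted one, since part of the raw R^{2} comes simply from having six predictors.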

A number of other independent variables could have been included, but they were left out because they were difficult to obtain or difficult to measure. For example, as the activity is a social one, one indication of involvement is the extent of the social networks that have been built up within the community, which would encourage more participation in swing camps.

However, measuring this would have been subjective, so it was ultimately left out of the survey. As it turned out, even continuous variables were open to a great deal of subjectivity, as some results were distinctly anomalous, such as novices with less than six months' experience attending 20 camps (this response was rejected for missing data).

The fact that the survey was conducted online requires a discussion of whether the sample is reasonably representative of the swing community. The method leaves out a portion of the community and may bias the results. One statistic from this survey may throw some light on this. The respondents were asked how they received information about swing and which method they used most. In a sample of 225 (more responses were usable when respondents did not have to enter continuous variables), their answers were:

- by word of mouth: 28%
- dance school or club I go to: 12%
- web sites: 36%
- email: 24%

It is not surprising that 60% of the respondents use the internet and email most, considering the survey was conducted through the web. But when 40% of the sample still do not use the internet principally for their source of information it suggests that this survey may not be truly representative of the dance community.

For example, the survey may under-represent those who do not have internet connectivity in their jobs or at home, or who lack the time to surf the net in their spare time. It also excludes an older section of the population which may not be as familiar with using the internet. As age is an independent variable in this regression, this could affect the results. However, I suspect that the age range in this survey is fairly representative, as the energetic nature of this dance tends to attract a relatively young crowd.

The results may also be biased, as certain portions of the population with easy access to the internet are over-represented. For example, in the sample of 116, 32 (28%) are students and 25 (21.5%) have jobs in the computer industry. There is a possibility that the students all came from the same school or community, accessing the information from one site, but this seems unlikely as the replies came from diverse locations all over the country on different days. Taking a wider sample that uses some of the data rejected for the regression analysis (sample size = 225), the results were still similar: 53 (24%) were students and 49 (22%) were in the computer industry.

However, the sample also showed a sex ratio that mirrors the real population statistics, with 48.2% men and 51.8% women, despite the fact that more than 20% of the respondents were in the computer industry and nearly all of those were men. This also makes sense in a partner dancing community where equal numbers of both sexes are needed. The results also show that the largest group was in their 20s: 22 respondents were in their teens, 28 were aged 20 to 24 and 29 were aged 25 to 29 (49% in their 20s), 28 were aged 30 to 39 (24%), 8 were aged 40 to 49 (7%), and one was aged 50 or above (less than 1%).

The question is whether those who were left out of this survey would behave very differently from those who responded to the survey. This is difficult to say from the present data.

7. Importance of results

The purpose of the regression model was to identify some of the key factors which affected the degree of involvement in the swing dancing community. This allows a better understanding of a recent social phenomenon and how it develops.

The nature of the activity, and the fact that it is a social one, also lends a certain arbitrariness to the results, which are a function of the many factors that could influence a choice determined not by rational motivations but by pure whim.

Appendices

A. The survey questions :

Not all of these questions were asked just for this project. Some were asked so that the data could be given to the swing dance community in return for their help in conducting this survey. The questions can be found at my website at http://www.columbia.edu/~ll296/survey.html.

Age:

Gender:

Profession/Job:

Your yearly income in $US is:

City, State, Country:

Single or not: Single / Attached

How many hours per week do you devote to swing:

What is your swing dance skill level:

a) novice

b) intermediate

c) advanced

Before you took up swing which of the following did you do "regularly"...

(check all that apply):

a) dance (other forms of dance)

b) practiced martial arts

c) did gymnastics

d) take part in sports

e) was a couch potato

In your job you usually:

a) interact frequently with people

b) interact occasionally with people

c) do not meet people

As far as free time goes, how many hours per week do you have?

How many swing camps or workshops have you attended?

Which of the following Swing paraphernalia have you purchased? Check all that apply:

a) t-shirts, swing clothes

b) videos

c) CDs

d) shoes

How much money have you spent on Swing in the last 6 months?

Pairs of two-tone swing shoes I own (Bleyers and otherwise):

Musical knowledge:

a) can sing or play a musical instrument

b) can’t sing or play but like listening

c) don’t listen much

Which swing style do you consider yourself? (check all that apply):

1) East Coast

2) West Coast

3) Western Swing

4) Lindy Hop

5) Others

Your Favorite Swing Albums include...(separate by comma):

Which source do you use most for getting information about swing?

1) word of mouth

2) email

3) websites

4) dance school or club I go to

How did you get into swing?