DATA ACCESS AND REPLICATION INSTRUCTIONS FOR "ENTREPRENEURSHIP AND THE BUSINESS CYCLE" 
by Philipp Koellinger and Roy Thurik

The three data sources that we used are publicly available:

(a) OECD data for annual real GDP in constant 2000 prices in national currencies and standardized unemployment rates (extracted 21 April 2010).
(b) Compendia 2007.1 data on harmonized shares of business owners in the total labor force, available at http://www.ondernemerschap.nl (extracted 3 March 2009).
(c) Global Entrepreneurship Monitor data on nascent entrepreneurial activity, available at http://www.gemconsortium.org/ (extracted 17 February 2009). Country averages were computed from the individual-level data, using the methods described in Reynolds et al. (2005) and Koellinger (2008).

The data that can be extracted from these three sources tend to change slightly over time, partly because new information becomes available. A fresh extraction may therefore differ slightly from the data we used. The files below contain the most recent data that were available to us at the time of our calculations.

Content of files:
(1) "Country_weights_by_GDP.xls" contains raw GDP data and the country weights that were computed from these data. The country weights are then used in (2) to compute the "World business cycle" data (3).
(2) "BC_panel.dta" contains the raw data per country and year, the detrended data, and the country weights computed in (1). 
(3) "World_business_cycle.dta" contains the aggregated detrended data that were used to compute Tables 2 and 3.
(4) "Table_1_Country_series_correlations_summary.xls" contains the cross-country year-by-year correlations of the detrended GDP, unemployment and self-employment data that were used for Table 1.
(5) "Generation_avg_weighted_agg_dataset.do" contains the Stata commands that were used for generating the world business cycle data (3).
(6) "World_BC_Granger_causality_tests.do" is the Stata code used to compute Table 3, using dataset (3).
(7) "Country_level_VARs.do" is the Stata code used to compute Table 4, using dataset (2).
(8) "Dynamic_panel_estimations.do" is the Stata code used to compute Table 5 and the related robustness checks, using dataset (2).
(9) "Inno_opport_analyses.do" is the Stata code used to compute Table 6, using dataset (2).
(10) "JMulti_project.jsc" is the JMulTi version of (3) and (6).
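
For readers who want to see the logic behind files (1), (3) and (5) before opening the Stata code, the following is a minimal Python sketch of a GDP-weighted aggregation. It is not the authors' code. It assumes that a country's weight is its share of total GDP (with GDP expressed in a common unit) and that weights are renormalized over the countries observed in a given year. Neither assumption is stated in this document, and the function names and all numbers are hypothetical.

```python
# Illustrative sketch only; not the authors' Stata code.
# Assumption: each country's weight is its share of total GDP, with GDP
# already expressed in a common unit. All numbers below are made up.


def gdp_weights(gdp_by_country):
    """Return each country's share of the summed GDP."""
    total = sum(gdp_by_country.values())
    return {country: gdp / total for country, gdp in gdp_by_country.items()}


def weighted_aggregate(values_by_country, weights):
    """Weighted average over the countries that have an observation.

    Weights are renormalized over the observed countries, so a missing
    country does not shrink the aggregate (an assumption of this sketch).
    """
    observed = [c for c, v in values_by_country.items() if v is not None]
    weight_sum = sum(weights[c] for c in observed)
    return sum(weights[c] * values_by_country[c] for c in observed) / weight_sum


# Hypothetical inputs: GDP levels and detrended values for one year.
example_gdp = {"A": 600.0, "B": 300.0, "C": 100.0}
example_cycle = {"A": 0.02, "B": -0.01, "C": None}  # C has no observation

weights = gdp_weights(example_gdp)
world_value = weighted_aggregate(example_cycle, weights)
print(weights)
print(world_value)
```

The actual weighting scheme is defined in "Country_weights_by_GDP.xls" and "Generation_avg_weighted_agg_dataset.do"; consult those files for the exact procedure.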
 

Remarks:
- "World business cycle" VAR analyses (Table 2) and IRFs (Figures 2 & 3) were calculated with JMulTi (http://www.jmulti.de/) because JMulTi offers a larger number of model specification statistics than Stata.
- All other analyses, including the Granger causality test in Table 3, were carried out in Stata. Stata and JMulTi yield identical coefficient estimates for Table 2.
- HP-filtered data were calculated using the hprescott command in Stata 11.
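
As background on the detrending step, the following is a minimal Python sketch of the Hodrick-Prescott filter. It is not the hprescott implementation and not the authors' code. The filter picks the trend that minimizes the sum of squared deviations from the series plus lam times the sum of squared second differences of the trend, which reduces to solving one linear system. The smoothing parameter is not stated in this document, so the value 100 below is an assumption, and the input series is made up.

```python
# Illustrative sketch only; not the hprescott command or the authors' code.
# The trend solves (I + lam * K'K) * trend = series, where K is the
# (n-2) x n second-difference matrix.


def hp_filter(series, lam):
    """Return (trend, cycle) for a list of numbers and smoothing value lam."""
    n = len(series)
    if n < 3:
        raise ValueError("need at least three observations")

    # Build A = I + lam * K'K by adding each second-difference row's
    # outer product to the identity matrix.
    a = [[1.0 if i == j else 0.0 for j in range(n)] for i in range(n)]
    for r in range(n - 2):
        coeffs = {r: 1.0, r + 1: -2.0, r + 2: 1.0}
        for i, ci in coeffs.items():
            for j, cj in coeffs.items():
                a[i][j] += lam * ci * cj

    # Solve A * trend = series by Gaussian elimination with partial pivoting.
    aug = [row + [float(y)] for row, y in zip(a, series)]
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(aug[r][col]))
        aug[col], aug[pivot] = aug[pivot], aug[col]
        for r in range(col + 1, n):
            factor = aug[r][col] / aug[col][col]
            for k in range(col, n + 1):
                aug[r][k] -= factor * aug[col][k]

    # Back substitution on the upper-triangular system.
    trend = [0.0] * n
    for i in reversed(range(n)):
        s = sum(aug[i][k] * trend[k] for k in range(i + 1, n))
        trend[i] = (aug[i][n] - s) / aug[i][i]

    cycle = [y - t for y, t in zip(series, trend)]
    return trend, cycle


# Hypothetical annual series (for example, log real GDP); lam=100 is assumed.
example_series = [4.60, 4.63, 4.65, 4.64, 4.69, 4.72, 4.71, 4.76]
example_trend, example_cycle = hp_filter(example_series, 100.0)
print(example_cycle)
```

Two properties are useful as sanity checks: a perfectly linear series is returned unchanged as its own trend, and the cyclical component always sums to zero.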