CEPII, Recherche et Expertise sur l'economie mondiale
TRADHIST


Description


This dataset was built to analyze the globalization process in historical perspective, using gravity equations applied to bilateral international trade data (see Fouquin and Hugot, Working Paper n°2016-13, May 2016). The dataset gathers five types of variables: i) bilateral nominal trade flows, ii) country-level aggregate nominal exports and imports, iii) nominal GDPs, iv) exchange rates, and v) bilateral factors that are known to favor or hamper trade, including geographical distance, common borders, colonial and linguistic links, as well as bilateral tariffs. We then use this dataset to estimate the evolution of trade costs over two centuries. Overall, the dataset is unique in its temporal as well as geographical coverage: it gathers more than 1.9 million bilateral trade observations for the 188 years from 1827 to 2014. We also provide about 42,000 observations on aggregate trade, and about 14,000 observations each on GDPs and exchange rates.

Please read the CEPII working paper. If you don't find the answer to your question, please contact us.


Reference document to cite: Fouquin, M. and Hugot, J. (2016) Two Centuries of Bilateral Trade and Gravity Data: 1827-2014. CEPII Working Paper, N°2016-14.

Person in charge & contact: Jules Hugot, tradehistcepii.fr

Licence: Etalab 2.0

Download


STATA 12.0 version
  • TRADHIST_WP.dta [Original data set, associated with the CEPII Working Paper]

STATA 13.0 version

Methodology


The dataset has been built to explore the two modern waves of globalization: the First Globalization of the nineteenth century and the post-World War II Second Globalization. The dataset gathers five types of variables: i) bilateral nominal trade flows, ii) country-level aggregate nominal exports and imports, iii) nominal GDPs, iv) exchange rates, and v) bilateral factors that are known to favor or hamper trade, including geographical distance, common borders, colonial and linguistic links, as well as bilateral tariffs. The data are unique in terms of both temporal and geographical coverage.

We adopt a systematic approach to collecting all of this data, with the exception of tariffs. For each variable, we merge the pre-existing (secondary) sources with additional data directly extracted from primary sources, including government publications, books and academic articles. In the end, we gather more than 1.9 million bilateral trade observations for the 188 years from 1827 to 2014. We also provide about 42,000 observations on aggregate trade (i.e. total imports and exports at the country level), and about 14,000 observations each on GDPs and exchange rates. The country pairs for which we collected bilateral trade are also associated with the great-circle distance and the shortest maritime distance between them, as well as with a set of dummy variables reflecting colonial and linguistic links. Nominal values have been systematically converted to British pounds sterling in order to make the data internationally comparable.
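As an illustration of the great-circle distance mentioned above, here is a minimal sketch using the standard haversine formula. The text does not say how the distances in the dataset were computed or between which points (capitals, main cities, etc.), so the function name `great_circle_km`, the mean Earth radius, and the example coordinates are assumptions made for this sketch only.

```python
import math

EARTH_RADIUS_KM = 6371.0  # mean Earth radius; an assumption of this sketch


def great_circle_km(lat1, lon1, lat2, lon2):
    """Haversine great-circle distance in km between two points given in degrees.

    Hypothetical helper for illustration; not the dataset's documented method.
    """
    phi1, phi2 = math.radians(lat1), math.radians(lat2)
    dphi = phi2 - phi1
    dlmb = math.radians(lon2 - lon1)
    a = (math.sin(dphi / 2) ** 2
         + math.cos(phi1) * math.cos(phi2) * math.sin(dlmb / 2) ** 2)
    # Clamp to 1.0 to guard against floating-point overshoot near antipodes.
    return 2 * EARTH_RADIUS_KM * math.asin(min(1.0, math.sqrt(a)))


# Example: approximate city-centre coordinates for London and Paris.
london_paris = great_circle_km(51.5074, -0.1278, 48.8566, 2.3522)
```

The shortest maritime distance is a different quantity, since it depends on sea routes, and cannot be derived from this formula.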

Our most significant contribution is to expand the available bilateral trade data, both in terms of temporal and geographical coverage. Of the 1.9 million bilateral trade observations, more than 1.6 million fall in the 1948-2014 period, and about 97% of these observations come from the Direction of Trade Statistics dataset (International Monetary Fund, 2002 and 2015). Our contribution therefore concentrates on the period prior to 1948, for which we provide about 185,000 new observations out of the 240,000 observations we report. For this period, we more than quadruple the amount of data available.

The structure of the data is bilateral: each observation pertains to a specific trade flow in a given year. Each observation is therefore country-pair, direction, and year-specific. The resulting panel is highly unbalanced, as the number of available bilateral trade flows increases dramatically over time.

The origin and the destination country are respectively identified by the iso_o and the iso_d variables. The monadic variables, which are only country and year-specific, are duplicated whenever needed and attached to the origin or the destination country accordingly. Variables pertaining to the origin (destination) country are identified by the suffix _o (_d). These variables include aggregate exports and imports, GDP and exchange rate.
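The layout described above can be sketched on a toy example. Only iso_o, iso_d and the _o/_d suffix convention come from the text; the column names `year`, `flow` and `gdp`, the helper `attach_monadic`, and all numeric values are hypothetical placeholders invented for this illustration, not actual variable names or data from the dataset.

```python
# Country-year (monadic) values, keyed by (country code, year).
# Placeholder numbers, not real GDP figures.
monadic = {
    ("GBR", 1900): {"gdp": 100.0},
    ("FRA", 1900): {"gdp": 80.0},
}

# One row per origin, destination and year: each direction is a separate row.
bilateral = [
    {"iso_o": "GBR", "iso_d": "FRA", "year": 1900, "flow": 5.0},
    {"iso_o": "FRA", "iso_d": "GBR", "year": 1900, "flow": 4.0},
]


def attach_monadic(rows, monadic):
    """Copy each country-year variable onto the bilateral rows,
    suffixed _o for the origin country and _d for the destination."""
    out = []
    for row in rows:
        enriched = dict(row)
        for side in ("o", "d"):
            key = (row[f"iso_{side}"], row["year"])
            # Country-years absent from `monadic` add no extra fields.
            for name, value in monadic.get(key, {}).items():
                enriched[f"{name}_{side}"] = value
        out.append(enriched)
    return out


panel = attach_monadic(bilateral, monadic)
```

In this sketch the same country-year value appears on several rows, once as an origin attribute and once as a destination attribute, which is what "duplicated whenever needed" amounts to in a dyadic panel.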