RESEARCH · PUBLISHED ARTICLE
Agricultural Policy Challenges and Legislation in the Early Kádár Era
Text analysis and text mining of historical sources
How is the economic policy of the early Kádár era reflected in legislative output? The study examines the second wave of Hungarian collectivisation using both a social-science classification method (the CAP codebook) and a text-mining procedure (topic modelling) — and then combines the two to show what digital text analysis can and cannot offer historians working in the Hungarian context.
CITATION
Ring, O. – Kiss, L. (2020). Agrárpolitikai kihívások és jogszabályalkotás a korai Kádár-korban: Történeti források szövegelemzése és szövegbányászati vizsgálata. Digitális Bölcsészet 3, 37–61. DOI: 10.31400/dh-hun.2020.3.1030
01
THE QUESTION
The second wave of collectivisation and legislation
What was at stake for economic policy
The years between 1957 and 1962 were decisive for the Kádár regime: collectivisation, suspended after 1956, was completed within a remarkably short period, and the agricultural sector was reorganised on a large scale. The legislative output of these years preserves the political will, the technical detail and the social-policy compromises that accompanied the second wave of collectivisation.
The research question
What does the legislation of the period reveal about how the regime understood and managed the agricultural transformation? The study reads the Acts and law-decrees not as neutral instruments but as politically saturated texts whose thematic emphasis, internal weighting and structural choices document the regime’s evolving priorities. The aim is to map the agricultural-policy agenda of the early Kádár era both quantitatively and interpretively.
The source base
The corpus comprises all agricultural-policy-relevant Acts and law-decrees passed between 1957 and 1962, harvested from the official legal database and cleaned for analysis. Each text was retained in its full legal form so that thematic content, internal structure and stylistic features could all be compared across the corpus.
02
METHODOLOGY
Two methods, side by side — and combined
Classification with the CAP codebook
The first method is a manual, top-down classification using the Comparative Agendas Project’s 21-category master codebook. Each Act and law-decree was coded for its primary policy topic (agricultural production, land tenure, cooperatives, prices and wages, taxation, and so on), producing a structured agenda dataset that can be compared with CAP series from other periods and countries. Because every text receives an explicit category, CAP coding makes the agenda visible in a transparent, checkable way.
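To make the shape of such an agenda dataset concrete, here is a minimal Python sketch. The record layout (`CodedText`), the identifiers and the six sample rows are invented placeholders, not data or structures taken from the study; the topic labels simply reuse the examples named above.

```python
from collections import Counter
from dataclasses import dataclass


@dataclass(frozen=True)
class CodedText:
    """One manually coded legal text (hypothetical record layout)."""
    text_id: str    # placeholder identifier, not a real citation
    year: int       # year of promulgation
    cap_topic: str  # primary policy topic assigned by the coder


# Invented placeholder records, NOT data from the study.
coded = [
    CodedText("doc-01", 1957, "agricultural production"),
    CodedText("doc-02", 1958, "cooperatives"),
    CodedText("doc-03", 1959, "cooperatives"),
    CodedText("doc-04", 1959, "land tenure"),
    CodedText("doc-05", 1961, "prices and wages"),
    CodedText("doc-06", 1961, "cooperatives"),
]


def agenda_shares(records):
    """Return each topic's share of all coded texts."""
    counts = Counter(r.cap_topic for r in records)
    total = sum(counts.values())
    return {topic: n / total for topic, n in counts.items()}


shares = agenda_shares(coded)
# Print topics from largest to smallest share.
for topic, share in sorted(shares.items(), key=lambda kv: -kv[1]):
    print(f"{topic:25s} {share:.2f}")
```

Because each text carries exactly one primary code, the shares sum to one, which is what makes series from different periods or countries directly comparable.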
Topic modelling (LDA)
The second method is bottom-up: a Latent Dirichlet Allocation (LDA) topic model is fitted to the corpus, allowing thematic structures to emerge from word co-occurrence patterns rather than from prior categorisation. Topic modelling does not “know” what a policy topic is; it only knows which words tend to occur together. Comparing the resulting topics to the CAP categories surfaces both alignments and surprises — themes the codebook misses, and combinations that the codebook splits apart.
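The idea that the model "only knows which words tend to occur together" can be illustrated with a small collapsed Gibbs sampler for LDA, written here in plain Python. This is a teaching sketch under stated assumptions, not the study's implementation: the source does not say which software, inference method or hyperparameters were used, and the function name `fit_lda`, the values of `alpha` and `beta`, and the four toy documents are all invented for illustration.

```python
import random


def fit_lda(docs, n_topics, alpha=0.1, beta=0.01, n_iter=200, seed=0):
    """Fit LDA to tokenised documents by collapsed Gibbs sampling.

    alpha and beta are symmetric Dirichlet priors (illustrative values).
    Returns per-document topic proportions, topic-word counts, vocabulary.
    """
    rng = random.Random(seed)
    vocab = sorted({w for doc in docs for w in doc})
    word_id = {w: i for i, w in enumerate(vocab)}
    n_words = len(vocab)

    doc_topic = [[0] * n_topics for _ in docs]              # tokens of doc d in topic k
    topic_word = [[0] * n_words for _ in range(n_topics)]   # tokens of word v in topic k
    topic_total = [0] * n_topics                            # all tokens in topic k

    # Start from a random topic assignment for every token.
    assignments = []
    for d, doc in enumerate(docs):
        z_doc = []
        for w in doc:
            k = rng.randrange(n_topics)
            z_doc.append(k)
            doc_topic[d][k] += 1
            topic_word[k][word_id[w]] += 1
            topic_total[k] += 1
        assignments.append(z_doc)

    for _ in range(n_iter):
        for d, doc in enumerate(docs):
            for i, w in enumerate(doc):
                v = word_id[w]
                k = assignments[d][i]
                # Remove the token's current assignment from the counts.
                doc_topic[d][k] -= 1
                topic_word[k][v] -= 1
                topic_total[k] -= 1
                # Conditional weight of each topic given all other tokens.
                weights = [
                    (doc_topic[d][t] + alpha)
                    * (topic_word[t][v] + beta)
                    / (topic_total[t] + n_words * beta)
                    for t in range(n_topics)
                ]
                k = rng.choices(range(n_topics), weights=weights)[0]
                assignments[d][i] = k
                doc_topic[d][k] += 1
                topic_word[k][v] += 1
                topic_total[k] += 1

    # Smoothed topic proportions per document.
    theta = [
        [(doc_topic[d][t] + alpha) / (len(doc) + n_topics * alpha)
         for t in range(n_topics)]
        for d, doc in enumerate(docs)
    ]
    return theta, topic_word, vocab


# Invented toy documents; real input would be the cleaned legal texts.
docs = [
    "cooperative member land cooperative statute".split(),
    "cooperative land member property".split(),
    "price wage tax price procurement".split(),
    "tax wage price procurement".split(),
]
theta, topic_word, vocab = fit_lda(docs, n_topics=2)
for row in theta:
    print([round(p, 2) for p in row])
```

Nothing in the sampler refers to policy categories: the only inputs are token counts, so any thematic structure in the output comes from co-occurrence alone. On a corpus this small the result depends heavily on the seed and the priors.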
Bringing the two methods together
The methodological contribution lies in combining the two layers: each text in the corpus has both a CAP code and a topic-model fingerprint, and cross-tabulating the two reveals where the structured frame agrees with the data-driven frame and where they diverge. The combined picture is richer than either method alone, and it makes the methodological commitments of each visible to the reader.
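A cross-tabulation of this kind can be sketched in a few lines of Python. The sketch assumes each text has been reduced to a pair of its CAP code and its single most prominent LDA topic; that reduction, the function name `crosstab` and the sample pairs are illustrative assumptions, not details reported in the study.

```python
from collections import Counter

# Invented (CAP code, dominant LDA topic) pairs, NOT data from the study.
paired = [
    ("cooperatives", 0), ("cooperatives", 0), ("cooperatives", 2),
    ("land tenure", 2), ("land tenure", 2),
    ("taxation", 1), ("prices and wages", 1),
]


def crosstab(pairs):
    """Count texts for every (CAP code, topic) combination."""
    table = Counter(pairs)
    rows = sorted({code for code, _ in pairs})
    cols = sorted({topic for _, topic in pairs})
    # Counter returns 0 for combinations that never occur.
    matrix = [[table[(r, c)] for c in cols] for r in rows]
    return rows, cols, matrix


rows, cols, matrix = crosstab(paired)
print(" " * 18 + "".join(f"topic {c:<3d}" for c in cols))
for code, counts in zip(rows, matrix):
    print(f"{code:18s}" + "".join(f"{n:<9d}" for n in counts))
```

Rows with a single non-zero cell are places where the two frames agree; rows or columns with several non-zero cells are places where they diverge.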
03
RESULTS
What do the numbers tell us about Kádár-era agricultural policy?
The agricultural-policy emphases of the period
The CAP-coded results show that the regime concentrated its legislative attention on three closely related areas: the institutional framework of cooperatives, agricultural production and land use, and the prices, wages and taxation that connected agricultural producers to the wider economy. Social policy, environmental protection and consumer-related categories appear only marginally: the agenda is highly production-oriented, with the state acting primarily as a restructuring agent rather than as a redistributive welfare state.
The “discovered” themes of the topic model
The topic model surfaced several themes the CAP codebook does not isolate cleanly. The most striking is a coherent cluster of texts dealing with the legal-administrative reorganisation of cooperative property — a hybrid theme that crosses CAP categories. A second is a set of compensation, indemnity and personal-property texts that the CAP coding distributes across categories but the topic model groups together. These “discovered” themes corroborate well-known historiographical readings of the period and add quantitative texture to them.
The two methods overlaid
Cross-tabulating the CAP codes with the LDA topics produces a high-resolution picture of the agricultural-policy agenda. Some CAP categories map cleanly onto a single topic; others fragment across topics, signalling internal heterogeneity that the codebook alone would conceal. Overall, the two methods agree in their broad strokes and disagree in their fine grain — and the disagreements are the most informative findings, because they mark exactly the places where prior categorisation and corpus structure diverge.
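One way to put a number on how cleanly a CAP category maps onto topics is the Shannon entropy of its topic distribution: 0 bits when every text in the category falls into one topic, and log2(k) bits when the texts are spread evenly over k topics. The study does not report using this measure; the sketch below, including the function name and the sample pairs, is an illustrative assumption.

```python
import math
from collections import Counter, defaultdict


def topic_entropy_by_category(pairs):
    """Shannon entropy (in bits) of the topic distribution per CAP code."""
    by_category = defaultdict(Counter)
    for code, topic in pairs:
        by_category[code][topic] += 1

    entropy = {}
    for code, counts in by_category.items():
        total = sum(counts.values())
        # p * log2(1 / p) equals -p * log2(p) and avoids a negative zero.
        entropy[code] = sum(
            (n / total) * math.log2(total / n) for n in counts.values()
        )
    return entropy


# Invented (CAP code, dominant LDA topic) pairs, NOT data from the study.
pairs = [
    ("taxation", 1), ("taxation", 1),
    ("cooperatives", 0), ("cooperatives", 2),
    ("cooperatives", 0), ("cooperatives", 2),
]
entropy = topic_entropy_by_category(pairs)
for code, bits in sorted(entropy.items()):
    print(f"{code:15s} {bits:.2f} bits")
```

In this toy input "taxation" scores 0 bits (one topic only) and "cooperatives" scores 1 bit (split evenly over two topics), so a ranking by entropy points straight at the categories whose internal heterogeneity the codebook alone would conceal.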
Possibilities and limits of digital text analysis in historical research
The study closes with a reflection on what computational text analysis can and cannot offer Hungarian historiography. Topic modelling is a powerful exploratory tool, but its output is sensitive to corpus size, parameter choices and pre-processing decisions; CAP coding gives interpretable categories, but it imports a frame that may not fit autocratic systems perfectly. Used together and interpreted with historical knowledge, the two produce a defensible, replicable and transparent reading of an otherwise opaque policy archive. The methodological lesson is that digital text analysis is most useful when it is presented openly alongside, rather than instead of, traditional historiographical reading.