Modern Japanese Literary Studies

Intellectual History of the Field: Dissertations, Topics, and Institutional Networks (1915-2025)

6,875 Modern Dissertations
495 Institutions
20 Topic Clusters
2,316 Advisor-Advisee Pairs

📊 Dual-Source Topic Classification

All analytical visualizations now support source toggling between keyword-based and NLP-based topic classifications:

Keyword Approach: 2,360 dissertations (34.3%); manually curated topic clusters; high precision, lower coverage.
NLP Approach: 6,097 dissertations (88.7%); semantic classification (Claude API); broader coverage (+54.4 pp).

Use the [Keyword] [NLP] [Combined] buttons in each visualization to compare approaches.

Topic Diffusion Timeline

How did each topic spread through institutions over time? Animated scatter plot showing the geographic diffusion of 20 intellectual topics across 15 universities from 1950 to 2025. Toggle between keyword and NLP classifications to compare coverage patterns.

Institutional Signatures

What are the intellectual signatures of the top 20 universities? Small multiples showing topic profiles across four time periods, revealing how each institution's focus has evolved. Compare keyword vs NLP institutional profiles.

Advisee Drift Analysis

How much do advisees drift from their advisor and institution? Scatter plot mapping 2,316 advisor-advisee pairs by topic similarity, revealing loyalists, rebels, and independents. See how NLP changes similarity scores.

Schools & Clusters

Are there distinct intellectual "schools" of institutions? Hierarchical clustering, heatmap, and network visualization revealing how universities group by topic similarity. Toggle sources to see different clustering patterns.

Field Evolution

How have topics shifted from 1950 to 2025? Stacked area chart and bump chart showing the rise and fall of 20 intellectual topics over 75 years of dissertation production. Compare temporal trends across classification methods.

Network Hierarchy Visualizations

Multiplex Network Explorer

Visualize the three network layers (advisor, review, citation) as separable planes. See which scholars bridge multiple networks, explore cross-layer connections, and filter by network membership with Venn diagram overlay.

Intellectual Flow Sankey

Trace how topics flow through advisor lineages across generations, see which advisors produce which topics, track institutional knowledge exchange, and watch topic evolution over 75 years of the field. Four modes in one visualization.

Radial Hierarchy Explorer

Select a founding scholar and watch their lineage radiate outward ring by ring. Explore generation depth, expand/collapse subtrees, animate growth over time, and trace lineage paths back to the roots of the discipline.

Data Downloads

Downloads are grouped into four categories: keyword-based datasets, NLP-based datasets, network & authority data, and reports.

Methodology

Data source: 6,875 dissertations from ProQuest Dissertations & Theses Global, enriched with Academic Family Tree advisor-advisee data and JSTOR/OpenAlex citation data. Includes 4,376 Japanese-language dissertations from CiNii/NDL.

Literature filter: Dissertations are classified as Japanese literature (modern and premodern) using a three-stage process: subject code classification, text-based keyword matching against titles and abstracts, and manual review of ambiguous cases. This filter excludes non-literary disciplines (history, political science, anthropology, etc.) while retaining the full scope of Japanese literary studies.
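The three-stage filter can be pictured roughly as follows. This is a minimal sketch, not the project's code: the subject codes, the keyword list, and the rule that a record goes to manual review when the two automatic signals disagree are all assumptions made for illustration.

```python
# Hypothetical subject codes and keywords, for illustration only.
LITERATURE_CODES = {"Asian literature", "Comparative literature"}
LITERARY_KEYWORDS = {"novel", "poetry", "fiction", "literature", "literary"}

def literature_filter(subject_codes, title, abstract):
    """Return 'include', 'exclude', or 'review' for one dissertation record."""
    text = f"{title} {abstract}".lower()
    # Stage 1: subject code classification.
    code_match = bool(set(subject_codes) & LITERATURE_CODES)
    # Stage 2: keyword matching against title and abstract.
    keyword_match = any(keyword in text for keyword in LITERARY_KEYWORDS)
    if code_match and keyword_match:
        return "include"
    if not code_match and not keyword_match:
        return "exclude"
    # Stage 3 (assumed rule): the signals disagree, so a person decides.
    return "review"
```

Under these assumptions, a history dissertation whose title mentions poetry would be queued for review rather than silently dropped.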

Topic classification (Dual-source approach):

Keyword approach: 20 topic clusters defined by manually curated keyword lists and matched against the full text of titles and abstracts. Each topic score is normalized by word count. Coverage: 2,360 dissertations (34.3%).
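A minimal sketch of keyword scoring with word-count normalization. The two keyword lists and the lowercase ASCII tokenizer are hypothetical stand-ins; the project's curated lists and matching rules are not given in this document.

```python
import re

# Hypothetical keyword lists; the real curated lists are not shown here.
TOPIC_KEYWORDS = {
    "gender": {"women", "gender", "feminist"},
    "war_memory": {"war", "memory", "occupation"},
}

def keyword_scores(title, abstract):
    """Return, per topic, the number of keyword hits divided by total word count."""
    # Assumption: simple lowercase ASCII tokenization.
    words = re.findall(r"[a-z]+", f"{title} {abstract}".lower())
    if not words:
        return {topic: 0.0 for topic in TOPIC_KEYWORDS}
    return {
        topic: sum(1 for word in words if word in keywords) / len(words)
        for topic, keywords in TOPIC_KEYWORDS.items()
    }
```

A dissertation would count as covered when at least one topic score is above zero. Note that a tokenizer like this one would return no words at all for text written only in Japanese script.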

NLP approach: Semantic topic classification using the Claude Sonnet 4.5 API, which processes full titles and abstracts to assign themes, literary figures, periods, and genres. Coverage: 6,097 dissertations (88.7%), including 94.2% of Japanese-language CiNii dissertations (vs. 3.2% with the keyword approach). Improvement: +54.4 percentage points in overall coverage (88.7% vs. 34.3%).
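The coverage percentages and the percentage-point gap between the two approaches can be recomputed from the counts reported above. The variable and function names below are illustrative.

```python
TOTAL_DISSERTATIONS = 6875
KEYWORD_CLASSIFIED = 2360
NLP_CLASSIFIED = 6097

def coverage_pct(classified, total=TOTAL_DISSERTATIONS):
    """Share of the corpus that received at least one topic, in percent."""
    return 100 * classified / total

keyword_pct = coverage_pct(KEYWORD_CLASSIFIED)
nlp_pct = coverage_pct(NLP_CLASSIFIED)
# A difference of two percentages is expressed in percentage points (pp).
gain_pp = nlp_pct - keyword_pct
print(f"{keyword_pct:.1f}% vs {nlp_pct:.1f}%, gain {gain_pp:+.1f} pp")
```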

Source toggling: All analytical visualizations support switching between keyword-based, NLP-based, and combined topic classifications. Use the toolbar buttons to compare approaches and see how classification method affects patterns, trends, and institutional profiles.

Similarity: Cosine similarity between 20-dimensional topic vectors. Advisor vectors aggregate their advisees' dissertation topics. Institution vectors aggregate all dissertations at that institution.
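A stdlib-only sketch of the similarity computation. The source says only that advisor and institution vectors "aggregate" dissertation topics; the element-wise mean used here, and the choice to return 0.0 for an all-zero vector, are assumptions.

```python
import math

def cosine_similarity(u, v):
    """Cosine of the angle between two equal-length topic vectors."""
    dot = sum(a * b for a, b in zip(u, v))
    norm_u = math.sqrt(sum(a * a for a in u))
    norm_v = math.sqrt(sum(b * b for b in v))
    if norm_u == 0 or norm_v == 0:
        return 0.0  # assumption: an unclassified vector is similar to nothing
    return dot / (norm_u * norm_v)

def aggregate(vectors):
    """Element-wise mean of several topic vectors (assumed aggregation rule)."""
    return [sum(column) / len(vectors) for column in zip(*vectors)]

# Toy 3-topic example; the project uses 20 dimensions.
advisor_vector = aggregate([[1.0, 0.0, 0.0], [0.0, 1.0, 0.0]])
advisee_similarity = cosine_similarity([1.0, 0.0, 0.0], advisor_vector)
```

Because cosine similarity ignores vector length, using a sum instead of a mean for aggregation would give the same similarity scores.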

Clustering: Complete-linkage agglomerative clustering of institutions by topic cosine distance, cut at 5 clusters.
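Complete-linkage clustering can be written in a few lines without numpy or scipy. The sketch below is a naive version under stated assumptions: it recomputes cluster distances on every pass, breaks ties by index order, and treats an all-zero vector as maximally distant. Function names are illustrative.

```python
import math

def cosine_distance(u, v):
    """1 minus cosine similarity; zero vectors get distance 1.0 (assumption)."""
    dot = sum(a * b for a, b in zip(u, v))
    norm_u = math.sqrt(sum(a * a for a in u))
    norm_v = math.sqrt(sum(b * b for b in v))
    if norm_u == 0 or norm_v == 0:
        return 1.0
    return 1.0 - dot / (norm_u * norm_v)

def complete_linkage(vectors, k):
    """Merge the two closest clusters until k remain.

    Under complete linkage, the distance between two clusters is the
    largest pairwise distance between their members.
    """
    dist = [[cosine_distance(u, v) for v in vectors] for u in vectors]
    clusters = [[i] for i in range(len(vectors))]
    while len(clusters) > k:
        best = None
        for a in range(len(clusters)):
            for b in range(a + 1, len(clusters)):
                d = max(dist[i][j] for i in clusters[a] for j in clusters[b])
                if best is None or d < best[0]:
                    best = (d, a, b)
        _, a, b = best
        clusters[a] = clusters[a] + clusters[b]
        del clusters[b]
    return [sorted(cluster) for cluster in clusters]
```

Calling `complete_linkage(institution_vectors, 5)` would correspond to the five-cluster cut described above, where `institution_vectors` is a hypothetical list of 20-dimensional topic vectors.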

Network metrics: PageRank (damping factor d=0.85, 50 iterations), HITS authority/hub scores (50 iterations), and generation depth (BFS from roots), computed across the advisor, review, and citation network layers for literature-filtered core scholars.
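A stdlib-only sketch of two of these metrics, PageRank by power iteration and BFS generation depth, run on a directed edge list. The edge direction (advisor to advisee), the uniform redistribution of rank from nodes without outgoing edges, and the definition of a root as a node with no incoming edge are assumptions; the source gives only the damping factor and the iteration count.

```python
from collections import deque

def pagerank(edges, d=0.85, iterations=50):
    """Power-iteration PageRank over (source, target) edges."""
    nodes = sorted({node for edge in edges for node in edge})
    if not nodes:
        return {}
    out_links = {node: [] for node in nodes}
    for source, target in edges:
        out_links[source].append(target)
    n = len(nodes)
    rank = {node: 1.0 / n for node in nodes}
    for _ in range(iterations):
        # Assumption: rank held by nodes with no out-links is spread uniformly.
        dangling = sum(rank[node] for node in nodes if not out_links[node])
        new_rank = {node: (1 - d) / n + d * dangling / n for node in nodes}
        for source in nodes:
            if out_links[source]:
                share = d * rank[source] / len(out_links[source])
                for target in out_links[source]:
                    new_rank[target] += share
        rank = new_rank
    return rank

def generation_depth(edges):
    """BFS depth of each scholar from the roots of an advisor -> advisee graph."""
    nodes = {node for edge in edges for node in edge}
    children = {node: [] for node in nodes}
    has_advisor = set()
    for advisor, advisee in edges:
        children[advisor].append(advisee)
        has_advisor.add(advisee)
    depth = {root: 0 for root in nodes - has_advisor}
    queue = deque(sorted(depth))
    while queue:
        node = queue.popleft()
        for child in children[node]:
            if child not in depth:  # keep the shortest path from any root
                depth[child] = depth[node] + 1
                queue.append(child)
    return depth
```

With the toy lineage `[("A", "B"), ("A", "C"), ("B", "D")]`, scholar A is a root at depth 0 and D sits two generations down.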

Tools: Python 3.13 (stdlib only, no numpy/scipy), SQLite, D3.js v7, HTML5 Canvas. All visualizations are self-contained HTML files. NLP classification via Anthropic Claude API (Sonnet 4.5).

Japanese Literary Studies Digital Humanities Project

Data: ProQuest, Academic Family Tree (CC-BY 3.0), JSTOR, OpenAlex