Common Features of Well-known Empirical Laws in Scientometrics: Data Grouping Based on Frequency Distribution

Document Type : Research Article

Authors

Abstract

Purpose: To study four well-known empirical laws in bibliometrics and scientometrics (Lotka, Bradford, Zipf, and Pareto) and to group their data based on frequency distribution.
Methodology: This is applied, descriptive research based on document analysis. Data associated with the four scientometric laws are examined and categorized according to their frequency distribution.
Results: The data for all four laws contain the rank of the object, the feature studied, and the frequency of that feature in the object. Each law classifies the data into a different number of groups: Pareto (2 groups), Bradford (3 groups), Lotka (as many groups as the maximum frequency), and Zipf (as many groups as there are words in the text).
Conclusion: The grouping of objects and the number of objects within each group follow a definite distribution. These empirical laws can be used to group different objects according to a wide range of features.
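
Illustration: The grouping described in the results can be sketched in a few lines of Python. This is only a minimal sketch under stated assumptions, not the procedure used in the study. The function names, the sample counts, the 20% head share for the Pareto split, and the equal-share rule for the Bradford zones are all hypothetical choices made for the example. For Zipf, the ranked rows themselves serve as the groups, one per distinct word.

```python
from collections import Counter


def rank_items(freq):
    """Return (rank, object, frequency) rows sorted by descending frequency."""
    ordered = sorted(freq.items(), key=lambda kv: (-kv[1], kv[0]))
    return [(rank, name, count) for rank, (name, count) in enumerate(ordered, start=1)]


def pareto_groups(rows, head_share=0.2):
    """Two groups: the top `head_share` of ranked objects and the rest.

    The 20% default is an assumption for illustration only.
    """
    cut = max(1, round(len(rows) * head_share))
    return rows[:cut], rows[cut:]


def bradford_zones(rows, zones=3):
    """Split ranked rows into zones holding roughly equal shares of total frequency.

    Assumes the total frequency is greater than zero.
    """
    total = sum(count for _, _, count in rows)
    result = [[] for _ in range(zones)]
    running = 0  # cumulative frequency before the current row
    for row in rows:
        index = min(zones - 1, running * zones // total)
        result[index].append(row)
        running += row[2]
    return result


def lotka_groups(rows):
    """One group per frequency value from 1 to the maximum frequency.

    Returns {frequency: number of objects with exactly that frequency}.
    """
    counts = Counter(count for _, _, count in rows)
    top = max(counts)
    return {k: counts.get(k, 0) for k in range(1, top + 1)}


# Hypothetical sample: number of articles per journal (not data from the study).
sample = {
    "A": 12, "B": 6, "C": 3, "D": 3, "E": 2, "F": 2, "G": 2,
    "H": 1, "I": 1, "J": 1, "K": 1, "L": 1, "M": 1,
}
rows = rank_items(sample)
head, tail = pareto_groups(rows)
zones = bradford_zones(rows)
by_frequency = lotka_groups(rows)
```

With this sample, `pareto_groups` yields two groups, `bradford_zones` yields three, and `lotka_groups` yields as many groups as the maximum frequency, some of which may be empty.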

Keywords


Ausloos, M. (2014). Zipf–Mandelbrot–Pareto model for co-authorship popularity. Scientometrics, 101(3), 1565-1586.
Axtell, R. L. (2001). Zipf distribution of US firm sizes. Science, 293 (5536), 1818-1820.
Bradford, S. C. (1934). Sources of information on specific subjects. Engineering: an Illustrated Weekly Journal (London), 137 (3550), 85–86.
Bradford, S. C. (1985). Sources of information on specific subjects. Journal of Information Science, 10 (4), 173–180.
Broadus, R. N. (1987). Toward a definition of ‘bibliometrics’. Scientometrics, 12 (5-6), 373–379.
Clauset, A., Shalizi, C. R., & Newman, M. E. (2009). Power-law distributions in empirical data. SIAM Review, 51 (4), 661-703.
Drott, M. C. (1981). Bradford's Law: Theory, empiricism and the gaps between. Library Trends, 30 (1), 41-52.
Franceschet, M. (2008). Frozen footprints. arXiv preprint arXiv:0811.4603. Retrieved Sep. 05, 2015, from http://arxiv.org/abs/0811.4603
Gelbukh, A., & Sidorov, G. (2001). Zipf and Heaps Laws’ coefficients depend on language. Computational Linguistics and Intelligent Text Processing. Berlin Heidelberg: Springer.
Harremoës, P., & Topsoe, F. (2005). Zipf's law, hyperbolic distributions and entropy loss. Electronic Notes in Discrete Mathematics, 21 (1), 315-318.
Hertzel, D. H. (1987). History of the development of ideas in bibliometrics. In Kent, A. (Ed.), Encyclopedia of library and information sciences (Vol. 42, pp. 144–219). New York: Marcel Dekker.
Hood, W. W., & Wilson, C. S. (2001). The literature of bibliometrics, scientometrics, and informetrics. Scientometrics, 52 (2), 291-314.
Leimkuhler, F. F. (1967). The Bradford distribution. Journal of documentation, 23 (3), 197-207.
Lotka, A. J. (1926). The frequency distribution of scientific productivity. Journal of the Washington Academy of Sciences, 16 (12), 317–323.
Malacarne, L. C., Mendes, R. S., & Lenzi, E. K. (2002). Q-exponential distribution in urban agglomeration. Physical Review, 65 (1), 17-26.
Mayr, P. (2013). Relevance distributions across Bradford Zones: Can Bradfordizing improve search? Retrieved Nov. 14, 2015, from http://arxiv.org/abs/1305.0357
Milojević, S. (2010). Power law distributions in information science: Making the case for logarithmic binning. Journal of the American Society for Information Science and Technology, 61 (12), 2417-2425.
Mitzenmacher, M. (2004). A brief history of generative models for power law and lognormal distributions. Internet Mathematics, 1 (2), 226-251.
Newman, M. E. (2005). Power laws, Pareto distributions and Zipf's law. Contemporary Physics, 46 (5), 323-351.
O'Connor, D. O., & Voos, H. (1981). Empirical Laws, Theory Construction and Bibliometrics. Library Trends, 30 (1), 9-20.
Pareto, V. (1964). Cours d'Économie Politique: Nouvelle édition par G.-H. Bousquet et G. Busino. Geneve: Librairie Droz.
Pritchard, A. (1969). Statistical bibliography or bibliometrics? Journal of Documentation, 25 (4), 348-349.
Van Raan, A. F. (2001). Two-step competition process leads to quasi power-law income distributions: Application to scientific publication and citation distributions. Physica A: Statistical Mechanics and its Applications, 298 (3), 530-536.
Wilson, C. S. (1999). Informetrics. Annual Review of Information Science and Technology (ARIST), 34 (1), 107-247.
Zipf, G. K. (1932). Selected studies of the principle of relative frequency in language. Cambridge (Mass): Harvard University Press.