Advances in Geographical and Environmental Sciences

Swapan Kumar Maity

Essential Graphical Techniques in Geography

Series Editor: R. B. Singh, University of Delhi, Delhi, India

The Advances in Geographical and Environmental Sciences series synthesizes diagnosis and prognosis of the earth environment, incorporating challenging interactive areas within the ecological envelope of the geosphere, biosphere, hydrosphere, atmosphere and cryosphere. It deals with land use land cover change (LUCC), urbanization, energy flux, land-ocean fluxes, climate, food security, ecohydrology, biodiversity, natural hazards and disasters, human health and their mutual interaction and feedback mechanisms in order to contribute towards a sustainable future. Geoscience methods range from traditional field techniques and conventional data collection, through remote sensing, geographical information systems and computer-aided techniques, to advanced geostatistical and dynamic modeling.

The series integrates the past, present and future of geospheric attributes, incorporating biophysical and human dimensions in spatio-temporal perspectives. The geosciences, encompassing land-ocean-atmosphere interaction, are considered a vital component in the context of environmental issues, especially in the observation and prediction of air and water pollution, global warming and urban heat islands. It is important to communicate the advances in geosciences to increase the resilience of society through capacity building for mitigating the impact of natural hazards and disasters. Sustainability of human society depends strongly on the earth environment, and thus the development of geosciences is critical for a better understanding of our living environment and its sustainable development. Geoscience also has a responsibility not to confine itself to current problems but to develop a framework for addressing future issues.
In order to build a ‘Future Earth Model’ for understanding and predicting the functioning of the whole climatic system, collaboration of experts in the traditional earth disciplines as well as in ecology, information technology, instrumentation and complex systems is essential, through initiatives from human geoscientists. Thus human geoscience is emerging as a key policy science for contributing towards sustainability/survivability science together with the Future Earth initiative.

The Advances in Geographical and Environmental Sciences series publishes books that contain novel approaches to tackling issues of human geoscience in its broadest sense; books in the series should focus on true progress in a particular area or region. The series includes monographs and edited volumes without any limitation on the number of pages.

More information about this series at https://link.springer.com/bookseries/13113

Swapan Kumar Maity
Department of Geography
Nayagram P.R.M. Government College
Jhargram, West Bengal, India

ISSN 2198-3542
ISSN 2198-3550 (electronic)
Advances in Geographical and Environmental Sciences
ISBN 978-981-16-6584-4
ISBN 978-981-16-6585-1 (eBook)
https://doi.org/10.1007/978-981-16-6585-1

© The Editor(s) (if applicable) and The Author(s), under exclusive license to Springer Nature Singapore Pte Ltd. 2021

This work is subject to copyright. All rights are solely and exclusively licensed by the Publisher, whether the whole or part of the material is concerned, specifically the rights of translation, reprinting, reuse of illustrations, recitation, broadcasting, reproduction on microfilms or in any other physical way, and transmission or information storage and retrieval, electronic adaptation, computer software, or by similar or dissimilar methodology now known or hereafter developed.
The use of general descriptive names, registered names, trademarks, service marks, etc. in this publication does not imply, even in the absence of a specific statement, that such names are exempt from the relevant protective laws and regulations and therefore free for general use.

The publisher, the authors and the editors are safe to assume that the advice and information in this book are believed to be true and accurate at the date of publication. Neither the publisher nor the authors or the editors give a warranty, expressed or implied, with respect to the material contained herein or for any errors or omissions that may have been made. The publisher remains neutral with regard to jurisdictional claims in published maps and institutional affiliations.

This Springer imprint is published by the registered company Springer Nature Singapore Pte Ltd. The registered company address is: 152 Beach Road, #21-01/04 Gateway East, Singapore 189721, Singapore

Dedicated to my Parents

Preface

Geography is a scientific discipline that emphasizes how and why different geographic features vary from one place to another and how the spatial patterns of these features change with time. Geographers concentrate on explaining how physical and cultural features are distributed on the earth's surface and what kinds of factors and processes are responsible for their spatial and temporal variations. Geographical data need appropriate, systematic and logical presentation for a better understanding of their cartographic characteristics. Suitable, accurate and lucid demonstration and visualization of geographical data help in their correct analysis, explanation and realization. Therefore, various types of primary and secondary data are used extensively to explain and analyze the spatial distributions and variations of different geographical events and phenomena.
Graphs, diagrams and maps are three unique and distinctive techniques of visualization of geographical data. In a narrow sense, graphical representation means the depiction of data using various types of graphs, but in a wider sense, all types of graphs, diagrams and mapping techniques are included in graphical methods of portraying data. Graphical representation of various kinds of geographical data is simple, attractive and easily understandable not only to geographers and academics but also to literate people in general. It is the key for geographers and researchers to recognize the nature of data, the pattern of spatial and temporal variations and their relationships, and to formulate principles for accurately understanding and analyzing features on or near the earth’s surface. These modes of representation also enable the development of spatial understanding and the capacity for technical and logical decision making.

In this book, attempts have been made to analyze and explain different kinds of graphs, diagrams and mapping techniques, which are extensively used for the visual representation of various types of geographical data. The book is divided into four main chapters.

Chapter 1 discusses the concept and types of geographical data, the major differences between them, the sources of each type of data, the methods of their collection, and the classification and processing of collected data, with special emphasis on the frequency distribution table. It also covers the methods of representing data, their appropriateness, and the advantages and disadvantages of using them. It includes a discussion of the concepts of attribute and variable, the types of variables and the differences between them, and it explains the different types of measurement scales used in geographical analysis.
Chapter 2 includes a detailed classification of all types of graphs and of the types of co-ordinate systems, with illustrations, as an essential basis for the construction of graphs. Different types of Bi-axial (Arithmetic and Logarithmic graph, Climograph etc.), Tri-axial (Ternary graph), Multi-axial (Spider graph, Polar graph etc.) and special graphs (Water budget graph, Hydrograph, Rating curve, Lorenz curve, Rank-size graph, Hypsometric curve etc.) are discussed with suitable examples in terms of their appropriate data structure, necessary numerical calculations, methods of construction, proper illustrations and the advantages and disadvantages of their use. The concepts of arithmetic and logarithmic graphs are explained precisely with pertinent examples and illustrations. Different types of frequency distribution graphs are explained with suitable data, the necessary mathematical and statistical computations and proper illustrations.

Chapter 3 focuses on a detailed discussion of various types of diagrams classified on different bases. All types of one-dimensional (Bar, Pyramid etc.), two-dimensional (Triangular, Square, Circular etc.), three-dimensional (Cube, Sphere etc.) and other diagrams (Pictograms and Kite diagram) are discussed with suitable examples in terms of their appropriate data structure, necessary numerical (geometrical) calculations, methods of construction, appropriate illustrations and the advantages and disadvantages of their use.

Chapter 4 explains basic cartographic terminology such as Geodesy, Geoid, Spheroid, Datum, Geographic co-ordinate system, Surveying and levelling, Traversing, Bearing, Magnetic declination and inclination etc. in a lucid manner with suitable illustrations. It includes a detailed classification and discussion of all types of maps based on their scale and on the purpose (contents) of preparing the map, with special emphasis on Indian Topographical Sheets.
All pictorial and mathematical methods of representation of relief are explained in detail with suitable examples and illustrations. Various types of distributional thematic maps are analyzed with suitable examples, emphasizing their suitable data structure, necessary numerical calculations, methods and principles of construction, proper illustrations and the advantages and disadvantages of their use. It also explains different techniques for the measurement of direction, distance and area on maps.

The methods of construction of all types of graphs, diagrams and maps are explained step-by-step in a systematic way so that readers can understand them easily and quickly. The book is unique in that it reflects an accurate correlation between theoretical knowledge of various geographical events and phenomena and their realistic implications, with suitable examples using proper graphical techniques. The book will be helpful for students, researchers, cartographers and decision-makers in representing and analyzing various geographical data for a better, systematic and scientific understanding of the real world.

Midnapore, West Bengal, India
Swapan Kumar Maity

Acknowledgements

It gives me immense pleasure to express my deep gratitude to all those who contributed in their own ways to the successful completion of this book. I am heartily thankful to everyone I have come across throughout this journey.

I owe my thanks to my students Rajesh Bag, Gopal Shee, Baneswar Adak, Suvajit Barman (SACT, K. D. College of Commerce and General Studies), Krishnapriyo Das and Arpita Routh for their help and support in preparing this book. I would like to express my gratitude to Dr. Samit Maiti, Assistant Professor of English, Seva Bharati Mahavidyalaya and Dr. Soumitra Chakraborty, Assistant Professor of English, Mallabhum Institute of Technology for their academic support and advice. I am really thankful to Mr.
Titas Aikat, GIS Manager, Horizen, Naihati and Mrs. Somrita Sinha, SACT, Department of Geography, Raja N. L. Khan Women’s College for their technical and academic support in preparing this book. I am also thankful to Dr. Netai Chandra Das, Officer-in-Charge and Assistant Professor of Philosophy, Nayagram P. R. M. Government College for his continuous encouragement and valuable suggestions.

I would like to express my heartfelt gratitude to Dr. Ramkrishna Maiti, Professor, Department of Geography and Environment Management, Vidyasagar University for his encouragement, support and advice. His valuable suggestions at the time of preparing this book helped me bring it to its final shape.

I convey many thanks to all my family members, especially to my wife Sonali, for her encouragement, co-operation and continuous moral and emotional support. I am very thankful to my little son Souparno for his co-operation, which gave me sufficient time for the completion of this book.

Midnapore, West Bengal, India
Swapan Kumar Maity

About This Book

Representation of geographical data using graphs, diagrams and mapping techniques is a key for geographers and for researchers in other disciplines to explore the nature of data, the pattern of spatial and temporal variations and their relationships, and to formulate principles for accurately understanding and analyzing features on or near the earth’s surface. These modes of representation also enable the development of spatial understanding and the capacity for technical and logical decision-making.

The book depicts all types of graphs, diagrams and maps, explained in detail with numerous examples. The emphasis is on their appropriate data structure, the relevance of selecting the correct technique, methods of their construction, advantages and disadvantages of their use and applications of these techniques in analyzing and realizing the spatial pattern of various geographical features and phenomena.
This book is unique in that it reflects an accurate correlation between theoretical knowledge of geographical events and phenomena and their realistic implications, with relevant examples using appropriate graphical methods. The book serves as a valuable resource for students, researchers, cartographers and decision-makers to analyze and represent various geographical data for a better, systematic and scientific understanding of the real world.

Contents

1 Concept, Types, Collection, Classification and Representation of Geographical Data
  1.1 Introduction
  1.2 Concept of Data
  1.3 Concept of Geographical Data
  1.4 Types of Data (Geographical Data)
    1.4.1 Qualitative Data (Attribute)
    1.4.2 Quantitative Data (Variable)
    1.4.3 Uni-Variate Data and Bi-Variate Data
    1.4.4 Difference Between Uni-Variate Data and Bi-Variate Data
    1.4.5 Independent Variable and Dependent Variable
    1.4.6 Difference Between Qualitative Data (Attribute) and Quantitative Data (Variable)
    1.4.7 Primary Data
    1.4.8 Secondary Data
    1.4.9 Advantages of Use of Primary Data Over the Secondary Data
    1.4.10 Difference Between Primary and Secondary Data
  1.5 Methods of Data Collection
    1.5.1 Methods of Primary Data Collection
    1.5.2 Methods of Secondary Data Collection
  1.6 Measurement Scales in Geographical System
    1.6.1 Nominal Scale
    1.6.2 Ordinal Scale
    1.6.3 Interval Scale
    1.6.4 Ratio Scale
  1.7 Processing of Data
    1.7.1 Classification of Data
    1.7.2 Tabulation of Data
    1.7.3 Frequency Distribution
  1.8 Methods of Presentation of Geographical Data
    1.8.1 Textual Form
    1.8.2 Tabular Form
    1.8.3 Semi-Tabular Form
    1.8.4 Graphical Form (Graphs, Diagrams and Maps)
  References

2 Representation of Geographical Data Using Graphs
  2.1 Concept of Graph
  2.2 Types of Co-ordinate System
    2.2.1 Cartesian or Rectangular Co-ordinate System
    2.2.2 Polar Co-ordinate System
    2.2.3 Cylindrical Co-ordinate System
    2.2.4 Spherical Co-ordinate System
  2.3 Selection of Scale in Constructing a Graph
  2.4 Advantages and Disadvantages of the Use of Graphs
  2.5 Types of Graphical Representation of Data
    2.5.1 Bi-axial Graphs or Line Graphs or Historigram
    2.5.2 Tri-axial Graphs
    2.5.3 Multi-axial Graphs
    2.5.4 Special Graphs
    2.5.5 Frequency Distribution Graphs
  References

3 Diagrammatic Representation of Geographical Data
  3.1 Concept of Diagram
  3.2 Advantages and Disadvantages of Data Representation in Diagrams
  3.3 Difference Between Graph and Diagram
  3.4 Types of Diagrams in Data Representation
    3.4.1 One-Dimensional Diagrams
    3.4.2 Two-Dimensional Diagrams
    3.4.3 Three-Dimensional Diagrams
    3.4.4 Other Diagrams
  References

4 Mapping Techniques of Geographical Data
  4.1 Concept and Definition of Map
  4.2 Concept of Plan
  4.3 Difference Between Plan and Map
  4.4 Elements of a Map
  4.5 History of Map-Making
    4.5.1 Ancient Age
    4.5.2 Mediaeval Age
    4.5.3 Modern Age
    4.5.4 Contributions of Indian Scholars
  4.6 Methods of Mapping the Earth
  4.7 Cartography
  4.8 Key Concepts of Cartography
    4.8.1 Geodesy
    4.8.2 Geoid
    4.8.3 Ellipsoid or Spheroid
    4.8.4 Surveying and Levelling
    4.8.5 Geodetic Surveying and Plane Surveying
    4.8.6 Datum
    4.8.7 Reduced Level
    4.8.8 Geographic Co-ordinate System
    4.8.9 Cardinal Points
    4.8.10 Map Projection
    4.8.11 Bearing
    4.8.12 Magnetic Declination
    4.8.13 Magnetic Inclination or Magnetic Dip
    4.8.14 Traversing or Traverse Survey
    4.8.15 Triangulation Survey
    4.8.16 Trilateration Survey
    4.8.17 Difference Between Triangulation and Trilateration Survey
  4.9 Types of Map
    4.9.1 General Reference Maps (General Purpose Maps)
    4.9.2 Thematic Maps (Special Purpose Maps)
    4.9.3 Types of Thematic Maps
  4.10 Types of Maps Based on Scale
    4.10.1 Large-Scale Maps
    4.10.2 Small-Scale Maps
    4.10.3 Medium-Scale Maps
  4.11 Based on the Purpose or Content or Function of the Map
    4.11.1 Physical or Natural Maps
    4.11.2 Cultural Maps
  4.12 Techniques for the Study of Spatial Patterns of Distribution of Elements (Distribution Map)
    4.12.1 Chorochromatic Map (Colour or Tint Method)
    4.12.2 Choroschematic or Symbol Map
    4.12.3 Choropleth Map
    4.12.4 Dasymetric Map
    4.12.5 Isarithmic Map (Isometric Map and Isopleth Map)
    4.12.6 Dot Map
    4.12.7 Flow Map
    4.12.8 Diagrammatic Map
  4.13 Importance and Uses of Maps
    4.13.1 Measurement of Direction
    4.13.2 Measurement of Distance
    4.13.3 Measurement of Area
  References

Index

About the Author

Dr. Swapan Kumar Maity is an assistant professor of geography at Nayagram P. R. M. Government College, Jhargram, West Bengal, India.
He completed his doctoral degree at Vidyasagar University with his dissertation titled Mechanisms of sedimen- tation in the lower reach of the Rupnarayan River. Dr. Maity has 7 years of teaching experience at the undergraduate level in geography and 2 years at the postgraduate level in geography and environmental management. His teaching interests include geotectonic, geomorphology, climatology and practical geography, including remote sensing and GIS. His main research areas include fluvial geomorphology, river sedi- mentation and sediment mineralogy. He has published several research articles in renowned journals and two books from Springer in the field of the mechanism and environment of river sedimentation. Dr. Maity is a life member of the Indian Institute of Geomorphologists. xvii Abbreviations ADCP Acoustic Doppler Current Profiler AE Actual evapotranspiration BB Backward Bearing CI Cropping Intensity DMS Degrees, Minutes, and Seconds DSMs Defence Series Maps DST Department of Science and Technology EGM 96 Earth Gravitational Model 1996 EI Erosional integral FAO Food and Agricultural Organization FB Forward Bearing GCA Gross Cropped Area GCS Geographic Co-ordinate System GPS Global Positioning System GRS-80 Geodetic Reference System 1980 GSI Geological Survey of India GTS Great Trigonometrical Survey HI Hypsometric integral IMF International Monitory Fund IQR Inter-quartile Range ISI Indian Statistical Institute MSL Mean Sea Level NATMO National Atlas and Thematic Mapping Organization NCA Net Cropped Area NMP National Map Policy NRSA National Remote Sensing Agency OSMs Open Series Maps PE Potential evapotranspiration PWD Public Works Department QB Quadrantal Bearing RB Reduced Bearing xix xx Abbreviations RL Reduced Level SLR Satellite Laser Ranging SOI Survey of India UNO United Nations Organization USDA United States Department of Agriculture UTM Universal Transverse Mercator VLBI Very Long Baseline Interferometry WCB Whole Circle Bearing WGS 84 World 
Geodetic System 1984 Symbols fi Class frequency xi Class mark wi Class width fdi Frequency density R f i Relative frequency N Total frequency F Cumulative frequency r Radial distance θ Azimuthal angle φ Polar angle or zenithal angle δ Latitude P Precipitation T Temperature R Soil moisture recharge U Utilization of water D Deficiency of water S Surplus of water Q Water discharge G Gini co-efficient Q1 Lower quartile Q2 Middle quartile Q3 Upper quartile Pr Population of r ranking city P1 Population of 1st ranking city hi Mid-value of the contour height H Maximum height of the basin ai Area between successive contours A Total basin area Sk Skewness β1 Skewness co-efficient μ3 Third central moment xxi xxii Symbols σ Population standard deviation μ1 First moment f (x) Probability density function μ Population mean β2 Kurtosis co-efficient li Length of side of equilateral triangle or square or cube ri Radius of the circle or sphere f Flattening e Eccentricity of the ellipse H Topographic height or Orthometric height h Spheroid or ellipsoid height N Geoid height I Magnetic inclination or magnetic dip Z Vertical component D Map distance S Map scale T Total number of full squares List of Figures Fig. 1.1 Qualitative classification of data (population) . . . . . . . . . . . . . . . . 27 Fig. 1.2 Different parts of an ideal table . . . . . . . . . . . . . . . . . . . . . . . . . . . 28 Fig. 2.1 Position of independent and dependent variables in different quadrants (Cartesian co-ordinate system) . . . . . . . . . 49 Fig. 2.2 Determination of location of a point on Cartesian co-ordinate system (3D) . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 50 Fig. 2.3 Determination of location of a point on polar co-ordinate system . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 50 Fig. 2.4 Determination of location of a point on cylindrical co-ordinate system . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 
. . . . 53 Fig. 2.5 Determination of location of a point on spherical co-ordinate system . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 54 Fig. 2.6 Line graph (Historigram) showing the temporal changes of total population in Kolkata Urban Agglomeration (KUA) Source Census of India . . . . . . . . . . . . . . . . . . . . . . . . . . . . 58 Fig. 2.7 Line graph or Historigram (Production of rice in India, 2000–2011) SourceDirectorate of Economics and Statistics (Government of India) . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 58 Fig. 2.8 a Arithmetic scale on both the axes, b Arithmetic scale on the ‘X’-axis but the logarithmic scale on the ‘Y ’-axis, c Arithmetic scale on the ‘Y ’-axis but the logarithmic scale on the ‘X’-axis, and d Logarithmic scale on both the axes . . . . . 60 Fig. 2.9 Arithmetic graph (Number of male and female deaths per year) . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 63 Fig. 2.10 Logarithmic graph (Number of male and female deaths per year) . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 63 Fig. 2.11 Poly graph showing total, male and female literacy rates . . . . . . 66 Fig. 2.12 Band graph showing the production of various crops in different years in India Source Directorate of Economics and Statistics, Ministry of Agriculture and Farmers Welfare . . . . 68 Fig. 2.13 USDA type of climograph . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 69 xxiii xxiv List of Figures Fig. 2.14 The base frame of Foster’s climograph . . . . . . . . . . . . . . . . . . . . . 70 Fig. 2.15 Climograph showing the wet-bulb temperature and relative humidity of Kolkata (after G. Taylor) . . . . . . . . . . . . . . . . . . . . . . 71 Fig. 2.16 Hythergraph showing the mean monthly temperature and rainfall of Burdwan district . . . . . . . . . . . . . . . . . . . . . . . . . . . 73 Fig. 
2.17 Identification of position of points in ternary graph . . . . . . . . . . . 74 Fig. 2.18 Identification of sediment type using ternary graph . . . . . . . . . . . 76 Fig. 2.19 Radar graph (Production of different crops) . . . . . . . . . . . . . . . . . 77 Fig. 2.20 Wind rose graph showing the percentage of days wind blowing from different directions . . . . . . . . . . . . . . . . . . . . . . . . . . 80 Fig. 2.21 Polar graph showing the number of corries facing towards different directions . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 81 Fig. 2.22 Scatter graph (Relation between the distance from CBD and air temperature) . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 84 Fig. 2.23 Positive, negative and no co-relation. . . . . . . . . . . . . . . . . . . . . . . 84 Fig. 2.24 Perfect positive and negative co-relation . . . . . . . . . . . . . . . . . . . . 85 Fig. 2.25 Linear and nonlinear co-relation . . . . . . . . . . . . . . . . . . . . . . . . . . 86 Fig. 2.26 Ergograph showing the relation between seasons, climatic elements and cropping patterns of Howrah, West Bengal . . . . . . 87 Fig. 2.27 Circular ergograph showing the rhythm of seasonal activities (after A. Geddes and G.G. Ogilvie 1938) . . . . . . . . . . . 90 Fig. 2.28 Ombrothermic graph of Purulia district, West Bengal . . . . . . . . . 91 Fig. 2.29 Water balance curve of a sample study area . . . . . . . . . . . . . . . . . 96 Fig. 2.30 Elements of a hydrograph . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 97 Fig. 2.31 Various components of run-off (after Singh 1994) . . . . . . . . . . . . 100 Fig. 2.32 Important components of streamflow hydrograph . . . . . . . . . . . . 101 Fig. 2.33 Rating curve (Relationship between stream stage and discharge) . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 101 Fig. 2.34 Lorenz curve showing the inequality in the distribution of number and area of land holdings . . . . 
Fig. 2.35 Lorenz curve showing the inequality in the distribution of total and urban population
Fig. 2.36 Lorenz curve showing the inequality of income distribution of people in Sweden, USA and India. Sources: Statistics Sweden, online database (2014); U.S. Census Bureau, Historical Income Tables (2016); Credit Suisse’s Global Wealth Databook (2014)
Fig. 2.37 Rainfall dispersion graph of Bankura district (year 1976–2015)
Fig. 2.38 Rank-size graph according to G.K. Zipf (arithmetic scale)
Fig. 2.39 Rank-size graph according to Pareto (logarithmic scale)
Fig. 2.40 Deviations in rank-size distribution
Fig. 2.41 Box-and-whisker graph without outliers
Fig. 2.42 Box-and-whisker graph with outliers
Fig. 2.43 Hypsometric curve for the whole earth
Fig. 2.44 Sample drainage basin showing height and area
Fig. 2.45 Area–height relationship of the given drainage basin
Fig. 2.46 Hypsometric curve of the given drainage basin
Fig. 2.47 Understanding the stages of landform development using hypsometric curve
Fig. 2.48 Histogram (average concentration of SPM in the air)
Fig. 2.49 Histogram (monthly income of families)
Fig. 2.50 Frequency polygon showing the average concentration of SPM in the air
Fig. 2.51 Frequency polygon showing the monthly income of families
Fig. 2.52 Histogram with polygon showing the average concentration of SPM (mg/m³) in the air
Fig. 2.53 Histogram with polygon showing the monthly income of families
Fig. 2.54 Frequency polygon of discrete variable (Distribution of landslide occurrences)
Fig. 2.55 Frequency curve showing the average concentration of SPM (mg/m³) in the air
Fig. 2.56 Frequency curve showing the monthly income of families
Fig. 2.57 Types of frequency curve
Fig. 2.58 Positive, negative and zero or no skewness
Fig. 2.59 Area under a standard normal curve
Fig. 2.60 Degree of peakedness (Kurtosis) of frequency curve
Fig. 2.61 Cumulative frequency curve (Ogive) showing the average concentration of SPM (mg/m³) in air
Fig. 2.62 Cumulative frequency curve (Ogive) showing the monthly income of families
Fig. 3.1 Vertical simple bar (Temporal change of urban population in India since independence). Source: Census of India, 2011
Fig. 3.2 Horizontal simple bar (Total population in selected states in India). Source: Census of India, 2011
Fig. 3.3 Multiple bars showing the continent-wise urban population (%) in 2000 and 2025*. Source: UN Population Division, 2009–2010 and The World Guide, 12th ed. (* Projected figures)
Fig. 3.4 Sub-divided bar (Production of different crops in selected years in India). Source: Ministry of Agriculture and Economic Survey, 2010–2011 and Husain, 2014
Fig. 3.5 Percentage bar showing the proportion of population in different age groups in selected states in India. Source: Census of India, 2011
Fig. 3.6 a Absolute population pyramid and b percentage population pyramid
Fig. 3.7 Ecological pyramid (Pyramid of numbers)
Fig. 3.8 Urban pyramid showing the percentage of towns in different size classes in India
Fig. 3.9 Rectangular diagram showing the area of irrigated land (hectares) by different sources in India
Fig. 3.10 Rectangular diagram showing the area of irrigated land (%) by different sources in India
Fig. 3.11 Triangular diagram (Geographical area of selected biosphere reserves in India)
Fig. 3.12 Square diagram (Population of selected million cities of India, 2011)
Fig. 3.13 Simple circular diagram (Cropping pattern in India, 2010–2011)
Fig. 3.14 Pie diagram (Consumption of different fertilizers in India)
Fig. 3.15 Percentage pie diagram showing the consumption of different fertilizers in India
Fig. 3.16 Doughnut diagram (Area under different land uses in selected districts of West Bengal)
Fig. 3.17 Steps of construction of cube diagram
Fig. 3.18 Cube diagram (Population of main seven tribes in India)
Fig. 3.19 Sphere diagram (Urban population of selected states in India, 2011)
Fig. 3.20 Kite diagram showing the number of vegetation species along the sand dune transect
Fig. 4.1 Plan of a college campus
Fig. 4.2 Elements of a map (Source: Sediment yield in global rivers, Milliman and Meade 1983)
Fig. 4.3 World map of Ptolemy
Fig. 4.4 Location and extent of Dwipic world as conceived in Ancient India [PURANAS]
Fig. 4.5 Relation between three surfaces of the earth
Fig. 4.6 Geoid, sphere and ellipsoid (Source: http://physics.nmsu.edu/~jni/introgeophys/05_sea_surface_and_geoid/index.html)
Fig. 4.7 Geoid and ellipsoid in the whole earth (Model of the earth’s shape)
Fig. 4.8 Elements of an ellipse
Fig. 4.9 Consideration of curvature of the earth in geodetic surveying (after Kanetkar and Kulkarni 1984)
Fig. 4.10 Equipotential surfaces as vertical datum [Relation between orthometric height (H), ellipsoid or spheroid height (h) and geoid height (N)]
Fig. 4.11 Concept of datum
Fig. 4.12 Reduced level
Fig. 4.13 Geographic co-ordinate system
Fig. 4.14 Cardinal points of the compass
Fig. 4.15 Concept of bearing
Fig. 4.16 True and magnetic meridian and true and magnetic bearing
Fig. 4.17 Whole circle bearing (a) and quadrantal bearing (b)
Fig. 4.18 Forward and backward bearing
Fig. 4.19 Magnetic declination (East and west magnetic declination)
Fig. 4.20 Magnetic inclination
Fig. 4.21 Open traverse (a) and closed traverse (b)
Fig. 4.22 Locating a point by angular measurement (a) and triangulation network (b)
Fig. 4.23 Locating a point by linear measurement (a) and trilateration network (b)
Fig. 4.24 Layout, dimension and scale of million sheets and Indian topographical maps (Old series)
Fig. 4.25 Layout, dimension and scale of Indian topographical maps (Open series) (National Map Policy-2005, Projection-UTM, Datum-WGS-84)
Fig. 4.26 Hachure lines or contours
Fig. 4.27 Relief shading or hill shading map
Fig. 4.28 Relation between contour spacing and steepness of slope
Fig. 4.29 Contour patterns of different relief features
Fig. 4.30 Relation between contour pattern and topographic expression
Fig. 4.31 Trigonometrical station, benchmark and spot height
Fig. 4.32 a Bhola GTS tower near Singur and b Semaphore Tower, Parbatichak, Arambagh, Hooghly, West Bengal, India
Fig. 4.33 Form lines between contours to show minor topographic details
Fig. 4.34 Simple chorochromatic map showing the spatial distribution of forest-covered areas in West Bengal (Source: NATMO MAPS, DST)
Fig. 4.35 Compound chorochromatic map showing the general land use pattern of Denan village, Purba Medinipur, West Bengal (Source: Field survey)
Fig. 4.36 Choroschematic map (Distribution of mineral and energy resources in West Bengal)
Fig. 4.37 a Population density map of West Bengal (2011), b Cropping intensity map of West Bengal (2018–2019)
Fig. 4.38 Visual difference between the Choropleth map (a) and the Dasymetric map (b) (Dasymetric map shows the exclusion areas of zero population)
Fig. 4.39 Procedures of drawing of Isopleth map (Isotherms in this sample area)
Fig. 4.40 Rainfall zones of West Bengal in Isopleth map (Source: NATMO MAPS, DST)
Fig. 4.41 Dot map showing the distribution of rural population in West Bengal (Source: Primary census abstract and district census report, 2011)
Fig. 4.42 Flow map showing the movement of local trains between selected stations in West Bengal
Fig. 4.43 Flow map showing the discharge of water in tributary rivers and main river (River ‘I’)
Fig. 4.44 Bar diagrammatic map showing the district-wise rural population in West Bengal (Census 2011)
Fig. 4.45 Circular diagrammatic map showing the gross cropped area of different districts of West Bengal (2018–2019)
Fig. 4.46 Measurement of direction on map
Fig. 4.47 Measurement of distance of curved features on map using straight-line segments
Fig. 4.48 Measurement of distance of curved features on map using toned thread
Fig. 4.49 An Opisometer (a) and the technique of measurement of distance of curved features on map using Opisometer (b)
Fig. 4.50 Measurement of area on map by Strips method
Fig. 4.51 Measurement of area on map by square grid method
Fig. 4.52 Measurement of area by dividing into regular geometric shapes
Fig. 4.53 Measurement of area having irregular boundary using geometric method
Fig. 4.54 Principles of measurement of area using Simpson’s method
Fig. 4.55 Planimeter

List of Tables

Table 1.1 Bi-variate data showing depth below ground (m) and air temperature (°C)
Table 1.2 Nominal data (Number of male and female students in different departments)
Table 1.3 Ordinal data (Literacy rate of few Indian states, 2011)
Table 1.4 Mohs scale of hardness of minerals (Ordinal scale)
Table 1.5 Characteristics of different scales of measurement
Table 1.6 Geographical classification of data (Population densities of some states in India, 2011)
Table 1.7 Chronological classification of data (Decadal growth rate of population in India, 1901–2011)
Table 1.8 Quantitative classification of data (Monthly income of a group of people)
Table 1.9 Blank table to show season-wise water discharge in Rupnarayan River
Table 1.10 Simple table (Population size of some selected countries, 2011)
Table 1.11 Complex table (Hypothetical state of the Earth’s atmosphere)
Table 1.12 Simple frequency distribution
Table 1.13 Grouped frequency distribution
Table 1.14 Frequency distribution table (Based on the data from Table 1.13)
Table 1.15 Exclusive and inclusive methods of selection of class limit
Table 1.16 Frequency distribution table showing the height (in metre) from mean sea level
Table 1.17 Frequency distribution table showing the mean monthly temperature (°F)
Table 1.18 Cumulative frequency distribution table using the data of Table 1.16
Table 1.19 Cumulative frequency distribution table using the data of Table 1.17
Table 1.20 Tabular presentation of data (% of sand, silt and clay in bed sediments of Rupnarayan River)
Table 2.1 Types of graphs
Table 2.2 Data for line graph or historigram (Temporal change of total population in Kolkata Urban Agglomeration)
Table 2.3 Data for line graph or historigram (Production of rice in India, 2000–2011)
Table 2.4 Database for arithmetic and logarithmic line graph (Age and sex-specific variation of death rates)
Table 2.5 Worksheet for poly graph (Total, male and female literacy rates in different census years in India)
Table 2.6 Worksheet for band graph (Production of different crops in India)
Table 2.7 Monthly wet-bulb temperature (°F) and relative humidity (%) of Kolkata, West Bengal
Table 2.8 Mean monthly temperature and rainfall of Burdwan district, West Bengal
Table 2.9 Database for ternary graph (Proportion of sand–silt–clay in sediments)
Table 2.10 Data for radar graph (Production of different crops in different years)
Table 2.11 Percentage of days wind blowing from different directions
Table 2.12 Data for polar graph (The orientation of corries in a glacial region)
Table 2.13 Database for scatter graph (The distributions of air temperature in the month of April around an urban area)
Table 2.14 Data for ergograph (Monthly temperature, relative humidity and rainfall of Howrah, West Bengal)
Table 2.15 Data for ergograph (Net acreage of different crops and their growing seasons of Howrah, West Bengal)
Table 2.16 Database for circular ergograph (Rhythmic seasonal activities)
Table 2.17 Data for ombrothermic graph (Average temperature and rainfall of Purulia district, West Bengal)
Table 2.18 Water need and supply (mm) of a region (field capacity: 100 mm)
Table 2.19 Water budget estimation for a sample study area (elevation: 12 m; field capacity: 102 mm)
Table 2.20 Stream stage and discharge relationship
Table 2.21 Worksheet for Lorenz curve (The number and area of land holdings)
Table 2.22 Worksheet for Lorenz curve (Total and urban population of six North Bengal districts of West Bengal)
Table 2.23 Inequality in the distribution of income of people of Sweden, USA and India
Table 2.24 Calculations for rainfall dispersion graph (Annual rainfall of Bankura district, year 1976–2015)
Table 2.25 Rank-size relationship of Indian cities (according to G.K. Zipf method)
Table 2.26 Rank-size relationship of Indian cities (according to Pareto method)
Table 2.27 Expected populations and their deviations from actual populations
Table 2.28 Calculations for area–height graph and hypsometric curve in a sample drainage basin (Fig. 2.44)
Table 2.29 Grouped frequency distribution with equal class size (average concentration of SPM in the air)
Table 2.30 Grouped frequency distribution with unequal class size (monthly income of families)
Table 2.31 Methods of calculating Y in f(x) for constructing a normal curve
Table 2.32 Standard normal distribution table
Table 2.33 Worksheet for drawing Ogive (with equal class size)
Table 2.34 Worksheet for drawing Ogive (with unequal class size)
Table 3.1 Types of diagrams
Table 3.2 Data for vertical simple bar diagram (Temporal changes of urban population in India)
Table 3.3 Data for horizontal simple bar diagram (Total population in selected states in India, 2011)
Table 3.4 Calculations for multiple bar diagram (Continent-wise urban population)
Table 3.5 Calculations for sub-divided bar diagram (Production of different crops in India, 1950–1951 to 2010–2011)
Table 3.6 Calculations for percentage bar diagram (Proportion of population in different age groups in selected states in India, 2011)
Table 3.7 Worksheet for age-sex pyramid (Based on the population of Purba Medinipur district, West Bengal, 2011)
Table 3.8 Database for urban pyramid (Size class distribution of towns in India, 2011)
Table 3.9 Calculations for rectangular diagram (Area of irrigated land by different sources of irrigation in India)
Table 3.10 Worksheet for triangular diagram (Geographical area of selected biosphere reserves in India)
Table 3.11 Worksheet for square diagram (Population of selected million cities of India, 2011)
Table 3.12 Worksheet for simple circular diagram (Cropping pattern in India, 2010–2011)
Table 3.13 Worksheet for pie-diagram (Consumption of fertilizers in India, lakh tonnes)
Table 3.14 Database for doughnut diagram (Area under different land uses in selected districts of West Bengal)
Table 3.15 Worksheet for cube diagram (Population of main seven tribes in India, 2011)
Table 3.16 Worksheet for sphere diagram (Urban population in selected states in India, 2011)
Table 3.17 Data for pictograms (Production of wheat in different years in India)
Table 3.18 Database for kite diagram (Number of vegetation species along the sand dune transects)
Table 4.1 Relation between colatitude (polar angle) and latitude
Table 4.2 Suitable projections for different maps
Table 4.3 Methods of conversion of Q. B. from W. C. B.
Table 4.4 Types of maps
Table 4.5 Layout, dimension and scale of million sheets and Indian topographical maps (Old series)
Table 4.6 Layout, dimension and scale of open series maps (National Map Policy-2005, Projection-UTM, Datum-WGS84)
Table 4.7 Contour patterns of typical topographic features
Table 4.8 Worksheet for choropleth map (Population density of different districts of West Bengal, 2011 census)
Table 4.9 Category-wise population density in different districts in West Bengal (2011 census)
Table 4.10 Worksheet for cropping intensity map of West Bengal (2018–2019)
Table 4.11 Category-wise cropping intensity in different districts in West Bengal (2018–2019)
Table 4.12 Calculations for dot map (Status of rural population in different districts of West Bengal, census 2011)
Table 4.13 Worksheet for flow map (Number of local trains connecting between selected stations in West Bengal, India)
Table 4.14 Worksheet for computing flow of water in tributary rivers and main river ‘I’
Table 4.15 Calculations for bar diagrammatic map (District-wise rural population in West Bengal, census 2011)
Table 4.16 Calculations for circular diagrammatic map (Gross cropped area of different districts of West Bengal, 2018–2019)
Table 4.17 Methods of measurement of area on map

Chapter 1
Concept, Types, Collection, Classification and Representation of Geographical Data

Abstract Geography is a scientific discipline that emphasizes the collection, processing, suitable representation and logical, scientific interpretation of various types of primary and secondary data, in order to understand and explain the spatial distributions and variations of geographical features and phenomena on or near the surface of the earth. This chapter focuses on the concept and types of data used in geographical analysis, the sources of each type of data, the methods of their collection and the advantages and disadvantages of their use. Major differences between the various types of data are discussed with suitable examples.
The chapter also discusses in detail the concepts of attribute and variable, the types of variables and the differences between them. The measurement scales used in geographical analysis, their characteristics and their application in geographical study are explained with numerous examples. Techniques of classification, tabulation and processing of the collected data on different bases (e.g. location or time) are discussed, with special emphasis on the preparation of the frequency distribution table and related terminology. Methods of representing all types of geographical data, their appropriateness and their advantages and disadvantages are explained with suitable examples.

Keywords Geographical data · Primary data · Secondary data · Data collection · Measurement scale · Data processing · Data representation

1.1 Introduction

Geography is a scientific discipline in which large volumes of primary and secondary data are used to explain and analyse geographical events and phenomena. Collected data are organized, represented and interpreted logically and scientifically, using various techniques, to understand and explain the spatial distributions and variations of geographical features on or near the surface of the earth. Geographical data need appropriate, systematic and logical presentation so that their cartographic characteristics can be understood. Suitable, accurate and lucid visualization of geographical data supports their correct analysis, explanation and interpretation.

© The Author(s), under exclusive license to Springer Nature Singapore Pte Ltd. 2021 S. K.
Maity, Essential Graphical Techniques in Geography, Advances in Geographical and Environmental Sciences, https://doi.org/10.1007/978-981-16-6585-1_1

Geographical studies emphasize how and why different features vary from one place to another and how the spatial patterns of these features change with time. Geographers always begin with the question ‘Where?’, investigating how different features are located in a physical or cultural area and monitoring the spatial patterns and variations of those features. Modern geographical study has shifted to ‘Why?’, asking why a particular spatial pattern exists, what kinds of processes (spatial or ecological) have influenced the pattern, and why such processes operate. Graphical visualization of geographical data is the key to recognizing the nature and character of the data and the pattern of their spatial and temporal variations, to communicating spatial information, to classifying features and objects and understanding their relationships, and to formulating principles that help in understanding the real world.

Graphs, diagrams and maps are three important methods of visual representation of data in geography. Each method is distinctive in the principles and procedures followed in its construction. In a narrow sense, graphical representation means portraying data using various types of graphs, but in a broader geographical perspective all types of graphs, diagrams and mapping techniques are considered graphical methods of presenting data.
Graphical techniques (graphs, diagrams and maps) are very popular among geographers and researchers because they improve understanding of the world around us by enriching the spatial intelligence of human beings and their capacity for technical and logical decision-making. Graphical representation of geographical data is simple, attractive and easily recognizable, not only to geographers and academics but also to the general literate public.

1.2 Concept of Data

A body of information in numerical form is known as data. In other words, data are characteristics or information which are generally numerical in nature and are collected through observation. In a technical sense, data means a set of values in quantitative or qualitative form concerning one or more individuals or objects. Data contain facts and information from which an inference may be made or a reliable conclusion may be drawn. Data are the raw materials of any type of research or investigation, so the collection of reliable and dependable data is a prerequisite for conducting research and drawing consistent conclusions. For instance, to analyse the trends and patterns of rainfall distribution in a region and its changes over time, daily or monthly rainfall data (in numeric figures) should first be collected, and suitable techniques should then be applied to those data to draw reliable inferences.

Major characteristics of data are:

1. Data must be represented in numerical forms.
2. All the data must be interrelated with each other.
3. They must be meaningful to the purpose for which they are required.

1.3 Concept of Geographical Data

Data that record the locations and characteristics of natural or human features or activities which occur on or near the earth’s surface are called geographical data.
Two important characteristics of these data are (i) the reference to geographic space or the earth’s surface (expressed by geographical co-ordinates) and (ii) the representation at a specific geographic scale.

Example: Amount of rainfall, temperature, height above mean sea level, number of landslide occurrences, volume of water discharge in a river, size of population, density of population, production of agricultural crops etc. are considered geographical data because they possess the two characteristics mentioned above.

Geographers and researchers use a huge amount of statistical information and data for the proper and logical understanding and explanation of various geographical features and phenomena on or near the earth’s surface. They widely use statistical techniques and principles for the correct and scientific processing, analysis, depiction and interpretation of the collected data.

1.4 Types of Data (Geographical Data)

Like other data, geographical data are of two main types on the basis of their nature and characteristics:

1.4.1 Qualitative Data (Attribute)

A qualitative characteristic of the information which cannot be measured and expressed in numerical or quantitative terms is called qualitative data or an attribute. An attribute refers to a quality of an observation which can be observed, ascertained and classified under different categories but cannot be expressed in quantitative or numerical form. Numerous qualitative data are used in geographical study, for example skin colour, educational status, efficiency, caste, and the attitude and mentality of people. Qualitative data are converted into numerical or quantitative form, usually by counting the observations in each category, so that statistical techniques can be applied during geographical investigation. For instance, 100 people are literate, 150 people are of general caste, the skin colour of 340 people is black etc.
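The conversion of an attribute into numbers by counting can be sketched in a few lines of Python. This is only an illustration: the list of survey responses and the helper names `tally` and `percentages` are hypothetical and do not come from the text.

```python
from collections import Counter

# Hypothetical survey records: one qualitative attribute per respondent.
# The values are invented for illustration only.
literacy_status = ["literate", "illiterate", "literate",
                   "literate", "illiterate", "literate"]


def tally(attribute_values):
    """Count how many observations fall into each category."""
    return Counter(attribute_values)


def percentages(counts):
    """Express each category count as a percentage of all observations."""
    total = sum(counts.values())
    return {category: 100.0 * n / total for category, n in counts.items()}


counts = tally(literacy_status)
shares = percentages(counts)
print(counts["literate"], "of", len(literacy_status), "respondents are literate")
```

Once the categories are counted, the resulting frequencies or percentages can be treated like any other numerical data.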
In all these cases, a quality or characteristic of the data has been converted into numeric form.

1.4.2 Quantitative Data (Variable)

A characteristic of the information which can be measured and expressed numerically in suitable units is called quantitative data or a variable. A variable refers to a quantitative characteristic of an individual or item which takes different values depending on situation and place, and these values can always be measured. A variable whose values depend on chance and cannot be predicted is called a random variable. For example, average monthly rainfall is 15 cm, the number of first-order streams in the river basin is 345, average water discharge in the stream is 560 m³/s, the rate of soil erosion is 1 mm/year, production of rice is 1200 kg/acre, the literacy rate of the country is 65%, and the fertility rate of a country is 12 persons per year per 1000 people; these are all quantitative expressions of data. Such data are well suited to the application of statistical techniques and are successfully used in geographical analysis.

1.4.2.1 Continuous Variable and Discontinuous or Discrete Variable

A variable which can take any value within a specified range is called a continuous variable. Such variables can be expressed not only in whole numbers but also in fractions, however small. When a continuous variable is represented in the form of a series, it is known as a continuous series.

Example: Amount of rainfall, temperature, height above sea level, velocity of river water, literacy rate, weight of people etc. are continuous variables. The amount of rainfall may be 25 cm, 25.5 cm, 25.55 cm or any other value. Similarly, the literacy rate may be 68%, 68.25%, 68.59% or any other value.

A variable which can assume only certain isolated or integral values is called a discontinuous or discrete variable.
Discrete variables cannot be expressed in fractional values. When a discontinuous or discrete variable is expressed in the form of a series, it is called a discontinuous or discrete series.

Example: The number of streams of different orders in a river basin, number of households in a village, number of agricultural or industrial workers in a country, number of people affected by a flood hazard, number of migrants etc. are discrete variables. The number of households in a village may be 205 or 206, but it cannot be 205.65, as a household cannot be divided into parts or fractions. Similarly, the number of people affected by a flood may be 450 or 451, but it cannot be 450.5 or 450.75.

1.4.2.2 Difference Between Continuous and Discontinuous or Discrete Variables

Major differences between continuous and discontinuous or discrete variables are as follows:

| Continuous variable | Discontinuous or discrete variable |
| --- | --- |
| 1. These variables can take any value within a specified interval | 1. These variables can take only some isolated or integral values |
| 2. Can be expressed not only in whole numbers but also in fractions | 2. Can be expressed only in whole numbers; fractional expression is not possible |
| 3. These variables are measurable but not countable | 3. These variables are countable but not measurable |
| 4. Continuity of representation of variables is maintained | 4. Continuity of representation of variables is not maintained |
| 5. Variables are expressed by a range, like −∞ < X < ∞ | 5. Variables are expressed by fixed values, like X = 0, 8, 12, 15, etc. |
| 6. Example: Height, rainfall, temperature, velocity etc. | 6. Example: Number of households, number of students, number of accidents etc. |

1.4.3 Uni-Variate Data and Bi-Variate Data

Statistical data relating to the measurement of one variable only are called uni-variate data.
For example, amount of organic matter in soil, concentration of Suspended Particulate Matter (SPM) in air, annual production of rice, income of a family etc. Generally, measures of central tendency, dispersion, skewness and kurtosis are used as the statistical measurements of these variables. Uni-variate data are represented by a letter or symbol ‘x’, and the ‘n’ values of the variable ‘x’ are expressed by x1, x2, x3, x4, …, xn.

Data relating to the simultaneous measurement of two variables are called bi-variate data (Table 1.1). For example, height above mean sea level and number of settlements, volume of surface run-off and rate of soil erosion, income and expenditure of a family, amount of fertilizer used and crop production, distance from the Central Business District (CBD) and lower atmospheric temperature etc. Here, one variable is influenced by the other, and thus bi-variate data have an independent and a dependent variable. Correlation and regression are popular statistical techniques for the analysis of these variables. Bi-variate data are represented by two letters or symbols (xi, yi), and the ‘n’ pairs of values are expressed by (x1, y1), (x2, y2), (x3, y3), …, (xn, yn).

Table 1.1 Bi-variate data showing depth below ground (m) and air temperature (°C)

| Sl. No | Depth below ground (m) | Air temperature (°C) | Sl. No | Depth below ground (m) | Air temperature (°C) |
| --- | --- | --- | --- | --- | --- |
| 1 | 0 | 10.6 | 9 | 840 | 22.1 |
| 2 | 140 | 11.6 | 10 | 690 | 22.6 |
| 3 | 300 | 13.3 | 11 | 590 | 23.6 |
| 4 | 170 | 13.8 | 12 | 820 | 25.5 |
| 5 | 310 | 15.1 | 13 | 1020 | 26.9 |
| 6 | 340 | 17.0 | 14 | 1150 | 30.2 |
| 7 | 460 | 19.3 | 15 | 970 | 30.6 |
| 8 | 550 | 20.6 | 16 | 830 | 26.2 |

1.4.4 Difference Between Uni-Variate Data and Bi-Variate Data

Major differences between uni-variate and bi-variate data are as follows:

| Uni-variate data | Bi-variate data |
| --- | --- |
| 1. The word ‘Uni’ means one. Statistical data involving a single variable are called uni-variate data | 1. The word ‘Bi’ means two. Statistical data involving two variables (one independent and one dependent variable) are called bi-variate data |
| 2. It is not associated with causes or relationships | 2. It is closely associated with causes or relationships |
| 3. Description of a specific variable is the main purpose of uni-variate analysis | 3. Explanation is the main purpose of bi-variate analysis |
| 4. Central tendency (mean, median and mode), dispersion (range, quartile, mean deviation, variance and standard deviation), skewness and kurtosis are the main techniques of uni-variate analysis | 4. It uses techniques such as correlation, regression, comparison, causes and explanations to analyse two variables simultaneously |
| 5. The result of uni-variate analysis is shown in a bar graph, pie-chart, line graph, box and whisker plot etc. | 5. The result of bi-variate analysis is shown in a table where one variable is contingent on the values of the other variable |
| 6. Example: Annual production of rice, annual precipitation, amount of suspended particulate matter in air etc. | 6. Example: Relation between height above mean sea level and number of settlements, volume of surface run-off and rate of soil erosion, distance from the Central Business District and temperature etc. |

1.4.5 Independent Variable and Dependent Variable

The variable which stands alone and does not depend on other variables, and which moreover controls other variables, is called the independent variable. There may be one or more independent variables. This variable is expressed by ‘x’ and is shown along the X-axis of a graph. The variable which depends on other variables and is affected by them is called the dependent variable. The value of the dependent variable changes with changes in the values of the independent variables. This variable is expressed by ‘y’ and is shown along the Y-axis of a graph.

Example: In the above data (Table 1.1), air temperature changes with the change of depth below ground.
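As a rough illustration of the correlation and regression techniques mentioned for bi-variate data, the sketch below applies them to the sixteen pairs of Table 1.1, taking depth as ‘x’ and air temperature as ‘y’. The function names `pearson_r` and `least_squares_line` are illustrative choices, and the standard product-moment and least-squares formulas are assumed here; the text itself does not prescribe a particular procedure.

```python
from math import sqrt
from statistics import mean

# Depth below ground (m) and air temperature (°C), Sl. No 1 to 16 of Table 1.1.
depth = [0, 140, 300, 170, 310, 340, 460, 550,
         840, 690, 590, 820, 1020, 1150, 970, 830]
temperature = [10.6, 11.6, 13.3, 13.8, 15.1, 17.0, 19.3, 20.6,
               22.1, 22.6, 23.6, 25.5, 26.9, 30.2, 30.6, 26.2]


def pearson_r(x, y):
    """Product-moment correlation coefficient between x and y."""
    mx, my = mean(x), mean(y)
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sxx = sum((a - mx) ** 2 for a in x)
    syy = sum((b - my) ** 2 for b in y)
    return sxy / sqrt(sxx * syy)


def least_squares_line(x, y):
    """Slope and intercept of the regression line of y on x."""
    mx, my = mean(x), mean(y)
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sxx = sum((a - mx) ** 2 for a in x)
    slope = sxy / sxx
    intercept = my - slope * mx
    return slope, intercept


r = pearson_r(depth, temperature)
slope, intercept = least_squares_line(depth, temperature)
print(f"correlation coefficient r = {r:.3f}")
print(f"temperature = {intercept:.2f} + {slope:.4f} * depth")
```

A value of r close to +1 indicates that temperature rises almost linearly with depth, and the slope gives the estimated rise in temperature (°C) per metre of depth.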
In Table 1.1, therefore, air temperature is the dependent variable and depth below ground is the independent variable. Again, the production of agricultural crops depends on the availability of water, the amount of fertilizer used, labour used etc. Here, crop production is the dependent variable, while availability of water, amount of fertilizer used and labour used are independent variables.

1.4.6 Difference Between Qualitative Data (Attribute) and Quantitative Data (Variable)

The following are the differences between qualitative and quantitative data:

| Qualitative data (Attribute) | Quantitative data (Variable) |
| --- | --- |
| 1. Data representing the qualitative characteristics of the statistical information | 1. Data representing the quantitative characteristics of the statistical information |
| 2. Data can be observed, ascertained and classified under different categories but cannot be expressed in numerical form | 2. Data attain different values which can easily be measured and expressed in numerical form |
| 3. Data should be transformed into quantitative form before being used in statistical analysis | 3. Different statistical techniques can easily be applied to these data |
| 4. Example: Educational status, caste system, attitude etc. | 4. Example: Amount of rainfall, volume of water discharge, rate of sediment transport etc. |

On the basis of the source of collection, geographical data are of two types.

1.4.7 Primary Data

Primary data are those data which are collected for a specific purpose directly from the field of investigation, and hence are original in nature. They are collected by the individual, group or authority that requires the data for its own use and treatment. These data have not previously been used in quantitative research. They are called raw data or basic data because they are collected directly from the field by field-workers, investigators and enumerators.
The level of accuracy and reliability of these data depends on the knowledge, efficiency, awareness and attitude of the researcher or investigator and also on the methods of data collection. The places or sources from which primary data are collected are known as primary sources.

Examples: Data collected from measurements of river depth, width, water velocity, water discharge, tidal water level etc. made directly in the field by the researcher using various instruments are primary data. Similarly, socio-economic data (caste, religion, literacy rate, job opportunity, income, expenditure, marital status, immunization status etc.) collected by the researcher through a household survey using a survey schedule are examples of primary data.

1.4.8 Secondary Data

Data which have previously been collected and published by someone for one purpose but are subsequently treated and utilized by someone else in a different connection are called secondary data. Secondary data are collected and published by organizations other than the authorities who subsequently need them. So, the primary data of one organization become the secondary data of another organization that later wants to use them. Because of this, secondary data are not considered basic data. The sources from which secondary data are collected are known as secondary sources.

Examples: Data collected from published books and journals, from maps, from internet sources, and from government and non-government offices are examples of secondary data. The Statistical Abstract of India and the Monthly Abstract of Statistics, published by the Central Statistical Organization, and other government publications are sources of secondary data.

1.4.9 Advantages of Use of Primary Data Over the Secondary Data

There is no hard and fast rule about which data should be used in geographical research or investigation.
The nature, scope and purpose of the geographical enquiry should be taken into consideration in deciding whether primary or secondary data are to be used. Although the use of secondary data is more convenient and economical, the use of primary data is preferable and much safer from several standpoints:

(a) Primary data usually provide more detailed information and a description of the investigation along with the unit of measurement.
(b) The methods, sources and any approximations used in collecting the data are clearly and specifically mentioned. So, it can be decided in advance how much reliance can be placed on the data when they are used.
(c) Primary data are more reliable, authentic and accurate than secondary data, as the latter may contain errors of transcription, rounding etc.

In spite of this, secondary data are used for the following reasons:

(a) Primary data are not available or cannot be collected directly because of limitations of time and money.
(b) Secondary data are required to compare data collected over a long period of time. Utmost accuracy is not so necessary in these cases.

1.4.10 Difference Between Primary and Secondary Data

Primary data and secondary data are essentially the same data at different stages, because the former are transformed into the latter with the passage of time. The major differences between primary and secondary data are:

| Primary data | Secondary data |
| --- | --- |
| 1. Primary data are collected directly from the field or area under study by the investigator | 1. Secondary data are collected from published books or journals, offices, internet sources, institutions etc. |
| 2. Data are the result of direct observations and interactions in the study area | 2. Data are mainly the result of publications |
| 3. Trained and efficient manpower is needed during the collection of primary data | 3. Trained and efficient manpower is not essential; an untrained person can collect the data |
| 4. Quality of data is largely affected by the knowledge, efficiency, awareness and attitude of the investigator | 4. The researcher or investigator has no role in controlling the quality of data |
| 5. Data are more accurate, authentic and reliable | 5. Data are less reliable because they may contain errors |
| 6. Data are always collected in the original unit | 6. Data can be collected in the original unit or in a converted form, like aggregate, ratio, average, percentage etc. |
| 7. Collection of data is time consuming, costly and sometimes risky | 7. Data collection is less time consuming and cost effective |
| 8. These data are at the first stage of their utilization and numerical techniques have not previously been applied | 8. Numerical techniques have previously been applied to these data, i.e. they are in the second, third or a later stage of their utilization |
| 9. Primary data are preferred by researchers in statistical investigations because of their several advantages | 9. Secondary data are used when primary data are unavailable or cannot be collected directly |

1.5 Methods of Data Collection

There is no hard and fast rule for adopting a specific method for the collection of geographical data. The method of data collection is decided by the objectives and purposes of the study. Young (1994) has divided data sources into two classes: (a) field sources and (b) documentary sources. Field sources are the sources of primary data, whereas documentary sources are the sources of secondary data.

1.5.1 Methods of Primary Data Collection

Generally, five methods are followed for the collection of primary data:

1. Observation method
2. Interview method
3. Sampling method
4. Experimentation method
5. Local sources method.
1.5.1.1 Observation Method

Continuous and intensive observation of different objects, events or phenomena is an important method of primary data collection. The success of this method depends on the knowledge, efficiency and capability of the researcher or investigator. There are three types of observation:

Direct Observation Method

The researcher or investigator collects the necessary information personally by being present in the field. The researcher visits the area to be studied with some hypotheses in mind. After intensive and careful field observation, new ideas and experiences are added to the previous hypotheses, which helps to develop the theory and makes the collected primary data reliable and relevant.

Example: In a landslide study, the researcher or investigator directly visits the landslide-affected area and collects data regarding the total area affected by the landslide, volume of materials displaced, length of sliding, slope of the land, composition of materials etc.

Advantages and Disadvantages of Direct Observation Method

Advantages

(i) More reliable and accurate data can be collected without any bias.
(ii) Usable for small-area investigation.
(iii) Privacy of data can be maintained.
(iv) Clarity and homogeneity of data can be maintained.
(v) Collection of complete data is possible.

Disadvantages

(i) Time and money may be wasted.
(ii) The method cannot be applied in a large study area.
(iii) Sometimes the personal feelings, emotions, attitude and prejudices of the researcher affect the collection method and the quality of the data.
(iv) Sometimes data collection becomes risky.

Indirect Observation Method

This method is applied when the respondent does not agree to provide information or to answer the questions accurately. In this situation, the respondent is avoided and information is collected from an associated third person.
Data are collected by the researcher personally or by an enumerator appointed by the researcher.

Advantages and Disadvantages of Indirect Observation Method

Advantages

(i) Less time consuming and cost effective.
(ii) Effective for the collection of qualitative data.
(iii) Data can be collected in risky conditions.
(iv) Effective for data collection in a large population.
(v) Unbiased data collection is possible.

Disadvantages

(i) The data are less reliable because they are collected from an associated third person.
(ii) The information provider may be biased.
(iii) Data may be biased due to negligence of the information provider.
(iv) Collected data may be erroneous due to the lack of a trained enumerator.

Participation Observation

The researcher or investigator collects the information by staying, living and interacting with the people of the area under study. In this method, the researcher builds a close relationship with the local people and intensively observes their daily activities and lifestyle. Questions are not asked; instead, data are collected through the observations, impressions and individual judgements of the investigator.

Example: For an intensive study of the livelihood pattern and social adjustments of the people of an indigenous tribal society, the researcher or investigator lives with the people of the society and builds a close relationship with them in order to collect the required information.

Advantages and Disadvantages of Participation Method

Advantages

(i) Reliable, unbiased and accurate data are collected because the researcher builds a close relationship with the local people.
(ii) Simple, easy and unambiguous technique of data collection.
(iii) Effective in qualitative data collection.
(iv) Collection of data about a specific group of people becomes possible.
(v) The researcher can change and modify the research hypothesis easily.
Disadvantages

(i) Complete observation and understanding of the research area is difficult when the area is totally unknown to the researcher.
(ii) The method demands courage and can be risky.
(iii) Time consuming and costly, because the researcher has to stay in the research area for a certain period of time.
(iv) Prior experience and training are required for the collection of data.
(v) Limited applicability in a large research area.

1.5.1.2 Interview Method

In the interview method, information is collected through conversation between the investigator or enumerator (interviewer) and the informant (interviewee). The interviewer interacts closely and discusses face to face with the informant to collect the data. Interview methods are of three types.

Interviewing by Questionnaire Method

In this method, the enumerators interview the concerned persons directly or indirectly and ask questions to collect information. The information is generally gathered using a standard set of questions. Before collecting the data, the researcher prepares a standard questionnaire.

Example: A researcher who wants to study the impact and management of flood in a flood-prone area will prepare a standard questionnaire covering points such as causes of flood, frequency of flood, duration of waterlogging during flood, area affected by flood, problems of flood, sources of food and drinking water during flood, flood control measures taken by governmental and non-governmental agencies, precautions to avoid flood, any advantages from flood etc.

Characteristics of Standard Questionnaire

(a) The questions should be meaningful, concise, clear and easy for the interviewee to understand.
(b) The number of questions should be limited, and they should be arranged sequentially and systematically.
(c) Questions should be impartial and unbiased to avoid hesitation on the part of the interviewee.
(d) All the questions should be relevant and sufficient for fulfilling the purpose of the research.
(e) Questions should be free of religious, political and other prejudices.
(f) Questions requiring calculation should be avoided.

The questionnaire method is of two types.

Direct Questionnaire Method

In this method, the researchers, or the enumerators appointed by the researcher, go personally to the persons or sources from whom (or which) the information is to be collected. The enumerators interview the concerned persons and ask the questions directly (face to face) at the time of data collection. This method is also called the interview schedule method.

Advantages and Disadvantages of Direct Questionnaire Method

Advantages

(i) Data have a high degree of accuracy because the investigators or enumerators have direct contact with the people (interviewees).
(ii) Data are more reliable and dependable.
(iii) The purpose of the study and the meaning of each question can be explained clearly and patiently to the interviewee.
(iv) It helps to collect only the relevant information.
(v) Privacy of data can be maintained.
(vi) Testing of data accuracy is possible.

Disadvantages

(i) It is a very expensive, time-consuming and complex technique of data collection.
(ii) Difficult to apply to a large number of observations over an extensive area.
(iii) An untrained and inefficient enumerator may collect erroneous information.
(iv) It allows the personal prejudices of the enumerators or investigators to affect the quality of the data and the inferences to be drawn.

Postal Method of Questionnaire Survey

A standard questionnaire is prepared and sent by post to different addresses to be answered. Generally, each questionnaire is accompanied by a letter of explanation and a self-addressed envelope so that the information is returned properly and as early as possible. This method is widely used for data collection in the planning process.
Advantages and Disadvantages of Postal Questionnaire Method

Advantages

(i) This method supports extensive investigations and covers large fields of study.
(ii) It is a very easy and quick method. Data can be collected within a very short time.
(iii) It is cost effective. Only postal charges are required.
(iv) It is free from the personal bias of enumerators or investigators.
(v) Data can easily be collected from long distances.
(vi) Very useful for judging the national point of view.

Disadvantages

(i) Some of the questionnaires may not be answered and returned to the researcher.
(ii) Questionnaires may be returned incomplete or without proper answers. Wrong answers may be given if the meaning of the questions is not properly understood.
(iii) The method cannot be applied to informants who are illiterate or who are unaware of the importance of and need for the information.
(iv) The accuracy of the information cannot be verified, so the data are not very reliable or dependable.

Interviewing by Informal Method

In this method, the researcher or investigator collects the required information from the unguarded remarks of the informants. Generally, this method is applied to collect information about a specific phenomenon or event. When the informants do not agree, or hesitate, to provide sufficient information, the investigator attempts to collect the necessary information through casual and seemingly unrelated conversation with them. The informants reveal the actual facts unintentionally, and these become important information for the researcher. No questionnaire is needed for collecting data by this method. The researcher or investigator asks the informants questions from memory.
Example: To conduct a study of the extent and status of illegal coal mining in a region, the researcher needs to collect information regarding the area and number of illegal coal mines, number of people engaged in this work, amount of coal extracted daily, means of mining, major uses of the mined coal, any problems faced by the miners etc.

Advantages and Disadvantages of Informal Interview Method

Advantages

(i) Extrovert persons can collect data easily by this method.
(ii) Qualitative data can easily be collected by this method.
(iii) Collected data are more reliable and dependable.
(iv) Behaviour of the informants is expressed accurately.

Disadvantages

(i) A trained and efficient investigator is required.
(ii) Alertness of the enumerator or investigator is essential.
(iii) Very costly and time consuming.

Interviewing by Telephone

The researcher collects the information by interviewing the informants over the telephone. This method is followed when a large amount of information is needed urgently within a very short time.

Advantages and Disadvantages of Telephone Interview Method

Advantages

(i) A large amount of information can be collected within a very short time and at little cost.
(ii) Fewer investigators are needed.
(iii) Data are reliable and collected systematically.
(iv) Very suitable for a small research area.

Disadvantages

(i) Applicability of this method depends on the availability of telephone communication.
(ii) Informants are not always available at short notice.

1.5.1.3 Sampling Method

Sampling is a very important method for the collection of primary data. Reliable statistical inferences can be drawn about a large number of observations (the population) under study by testing small samples collected from the population. The members of the population which are selected for statistical testing are called samples, and the technique of sample selection is called sampling.
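The idea of drawing an inference about a population from a small sample can be sketched as follows. The sketch assumes simple random sampling, which is only one of several possible techniques and is not singled out by the text; the household identifiers, the income figures and the function name `simple_random_sample` are hypothetical and generated purely for illustration.

```python
import random
from statistics import mean


def simple_random_sample(units, sample_size, seed=None):
    """Select sample_size distinct units, each with equal chance of selection."""
    rng = random.Random(seed)
    return rng.sample(units, sample_size)


# Hypothetical population: 500 households with synthetic monthly incomes.
generator = random.Random(42)
household_ids = list(range(1, 501))
income = {hid: generator.randint(3000, 15000) for hid in household_ids}

# Survey only 50 of the 500 households.
sample_ids = simple_random_sample(household_ids, 50, seed=1)

population_mean = mean(income.values())
sample_mean = mean(income[hid] for hid in sample_ids)
print(f"population mean income: {population_mean:.1f}")
print(f"sample mean income:     {sample_mean:.1f}")
```

The sample mean serves as an estimate of the population mean; how close it is depends on the sample size and on how the sample is selected.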
Sampling techniques are widely used for the collection of data in geographical study and research.

Example: To assess the quality of water in a lake, the required number of water samples should be collected from different parts and depths of the lake. Similarly, the socio-economic status of the people of a large slum area can be studied by collecting the required data from the slum households. When data cannot be collected from all slum households because of time and money constraints, households should be selected by a suitable sampling method.

Advantages and Disadvantages of Sampling Method

Advantages

(i) Data collection is cost effective and less time consuming.
(ii) It is applicable to all types of geographical survey.
(iii) A minimum number of investigators is required.
(iv) All the factors regarding survey and data collection can be monitored carefully.
(v) A trained and efficient researcher can solve various critical problems by collecting data with this method.

Disadvantages

(i) Collection of data by sampling requires a trained and efficient enumerator or investigator.
(ii) Presumptions, prejudices, partiality and negligence of the enumerator or investigator will affect the selection of samples.
(iii) A wrong sampling technique or the collection of wrong samples makes the result erroneous and less applicable.
(iv) Results from a sample study may not always reflect all the characteristics of the whole population.

1.5.1.4 Experimentation Method

It is an important part of the sampling method for primary data collection. In this method, the researcher or investigator collects the required samples from the study area, analyses and tests them in a laboratory or research centre, and thereby generates primary data.
Example: To determine the mineral composition of the soils of a region, the required number of soil samples has to be collected from the study area, and the samples should then be tested in the laboratory using different instruments (preferably the X-Ray Diffraction technique) and chemicals. Similarly, to study the arsenic contamination of groundwater in a region, a sufficient number of groundwater samples should be collected from wells, tube-wells or other sources and tested in the laboratory to generate primary data.

Advantages and Disadvantages of Experimental Method

Advantages

(i) It is an ideal method for generating data in the laboratory.
(ii) A large amount of data can be generated within a very short time.
(iii) All types of variables can be controlled and monitored easily.
(iv) It needs a minimum number of enumerators or investigators.

Disadvantages

(i) Generation of data by this method is very costly.
(ii) It cannot be applied in all types of geographical research.
(iii) Trained and efficient investigators are required to use the instruments and peripherals in the laboratory.

1.5.1.5 Local Sources Method

The researcher or an institution appoints local people of the research area as enumerators or investigators to collect data about a geographical phenomenon or event. Being local residents, the enumerators have a clear and detailed knowledge of the study area. They collect the required information by direct observation of the phenomena and send the collected data to the concerned researcher or institution. Generally, different types of regional geographical data are collected by this method.
Example Measurement of the daily water discharge of a stream, measurement of the hourly tidal water level in a tidal river, and collection of weather-related data (atmospheric temperature, amount of rainfall, wind direction, wind velocity, air pressure, humidity etc.) at a weather station should be done by appointing local residents as investigators. In a socio-economic survey, the daily livelihood pattern of a particular group of people can be studied by this method.

Advantages and Disadvantages of Local Sources Method

Advantages
(i) Data can be collected continuously and instantly.
(ii) Fewer enumerators can collect the required data.
(iii) Decisions can be made quickly.
(iv) Place-wise data can be collected from an extensively large research area.

Disadvantages
(i) The method of data collection is very costly.
(ii) Untrained and inefficient investigators can collect erroneous information.
(iii) Partiality and prejudices of the investigator lower the quality of the data.
(iv) Sometimes the data are collected on the basis of assumptions, which makes them undependable.

1.5.2 Methods of Secondary Data Collection

There is no specific method for the collection of secondary data. Generally, secondary data are collected from two main sources:
1. Published sources
2. Unpublished sources.

1.5.2.1 Published Sources

Numerous secondary data are collected from the published reports, records and documents of government offices and of non-government departments and agencies. These bodies prepare and publish reports, records and documents on various subjects, and the researcher or investigator often collects data from those published sources.
Example The main sources of data under this method are:
(a) publications of government,
(b) reports of different commissions and committees,
(c) reports and publications of trade associations and chambers of commerce,
(d) market reports and business bulletins of stock exchanges,
(e) economic, commercial and technical journals,
(f) publications of researchers and research institutions etc.

Secondary data are published by different organizations such as the United Nations Organization (UNO), International Monetary Fund (IMF), Food and Agriculture Organization (FAO), World Bank, UNESCO, UNICEF, Indian Statistical Institute (ISI) etc. In India, the National Remote Sensing Agency (NRSA), SIO, National Atlas and Thematic Mapping Organization (NATMO), Geological Survey of India (GSI), IMO, SSI etc. are important sources of various geographical maps and data.

1.5.2.2 Unpublished Sources

Sometimes the collected primary data are not formally published by the collector; these are called unpublished data. The researcher obtains such unpublished data for their own needs from the collecting individual or institution through personal connections and relationships. For example, the unpublished thesis of a scholar and the records and documents stored in different governmental and non-governmental offices are unpublished sources of secondary data.

1.5.2.3 Advantages and Disadvantages of Secondary Data Collection

Advantages
(i) It helps in furnishing reliable information and data.
(ii) This method of data collection is inexpensive. The cost of data collection is borne by governmental and non-governmental departments, offices and agencies.

Disadvantages
(i) The unit used in the published data may differ from that of the originally collected primary data, so the data collected from published sources may not serve the purpose.
(ii) The basis of classification and the method of data collection may also differ among the governmental and non-governmental sources from which the secondary data are collected. As a result, the data may not be appropriate for the researcher's purpose.

Detailed and careful scrutiny and verification of the data before they are put to use is a prerequisite of this method. The researcher or user should scrutinize the data cautiously in order to know whether they are appropriate for the intended purpose. Before collecting such data, the following points may be taken into consideration:
(1) the scope and objectives of the study for which the data were originally procured
(2) the units of the collected primary data
(3) the methods adopted for the collection of the data
(4) the degree of authenticity and accuracy of the data
(5) the honesty and reliability of the authorities who collected the data.

1.6 Measurement Scales in Geographical System

Measurement can be defined as assigning names to, and quantifying, the features of the earth's surface and working out the relationships among them. In general, measurement refers to the quantitative description (numerical value) of some properties or attributes of objects or events so that one object or event can be compared with others. It offers a platform to describe the attributes and to communicate this description to others. Measurements or data are the raw materials of descriptive and inferential statistics, on which statistical techniques work. Data include facts or figures recorded as the outcome of measuring or counting, from which reliable inferences are drawn.

After the data on the spatial or temporal distribution of a phenomenon, event or object have been procured and recorded, they need to be properly categorized and summarized in numerical form.
This categorization of the collected raw data involves four different processes of measurement, providing four types of ‘number scales’. These are:
(a) Nominal scale
(b) Ordinal scale
(c) Interval scale
(d) Ratio scale

1.6.1 Nominal Scale

It is the basic and simplest form of measurement, in which data are expressed in terms of identity only, like male or female, lowland or highland, unreserved or reserved category, present or absent etc. The nominal scale is thus similar to the binary scale, in which the presence of a character or phenomenon is expressed by the value ‘1’ and its absence by ‘0’ (Pal 1998). The tossing of a coin, which gives either head or tail, is the classical example of a nominal scale.

1.6.1.1 Characteristics of Nominal Data

a. Data should be exhaustive (include all events or phenomena under study) and mutually exclusive (no value falls in two or more categories).
b. The items in each category are counted and the total is represented by a number.
c. Data cannot be manipulated by any basic mathematical operation (addition, subtraction, multiplication, division etc.).
d. They are termed count data, in the form of frequencies.
e. All observations or items within each category are treated as the same.

1.6.1.2 Application in Geographical Study

It is used for determining the equality of, or differences between, geographical phenomena or events. The mode is the only measure of central tendency that can be used with nominal data. Frequency, binomial and multinomial expressions are easy with this type of data.

Examples Classification of land use (forest land, cultivated land, built-up land etc.) and classification of soils, rocks or minerals belong to the nominal scale. Table 1.2 shows the number of male and female students in different departments as an example of nominal scale measurement.

1.6.2 Ordinal Scale

It is the level of measurement superior to nominal scale.
In this scale, there is sufficient information to place each item before or after another along a scale in rank order, either individually or in groups. Differences between objects or events can therefore be established from their identities. The statement X < Y < Z indicates that there are three values or classes of an object or phenomenon, in which value or class X is less than value or class Y, and Y in turn is less than value or class Z.

Table 1.2 Nominal data (Number of male and female students in different departments)

Departments     Number of students
                Male    Female
Mathematics     35      15
Statistics      28      22
Physics         32      18
Chemistry       29      21
Geography       27      23

1.6.2.1 Characteristics of Ordinal Data

a. The direction and relative position of values on this scale are known.
b. The differences between objects or events by their identities can easily be established.
c. Mathematical operations (addition, subtraction, multiplication, division etc.) cannot be applied.
d. The actual differences between values cannot be known.
e. Some data are inherently ordinal in nature.

1.6.2.2 Application in Geographical Study

It is applied for determining the greater or lesser values of observations related to geographical phenomena or events, i.e. the rank of the different values of an observation can easily be identified on this scale. Mode, median, percentiles and the inter-quartile range (quartile deviation) are widely used as measures of central tendency and dispersion.

Examples Classification of the families of a region into rich, upper class, upper-middle class, middle class, lower-middle class and lower class according to their socio-economic status is an example of the ordinal scale. Similarly, the ranking of Indian states according to the literacy rate of the people is done using this scale (Table 1.3). The Mohs scale of hardness of minerals is another example of the ordinal scale (Table 1.4).
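The ranking in Table 1.3 can be reproduced with a short script. The following is only an illustrative sketch: the function name `rank_descending` and the variable names are hypothetical, and the literacy figures are those listed in Table 1.3. Because ordinal data do not support averaging, the sketch reports the two middle-ranked states using `median_low` and `median_high` rather than a mean of the two middle ranks.

```python
from statistics import median_low, median_high

# Literacy rates (%) of the ten states listed in Table 1.3 (2011)
literacy = {
    "Kerala": 93.91, "Mizoram": 91.58, "Tripura": 87.75, "Goa": 87.40,
    "Himachal Pradesh": 83.78, "Maharashtra": 82.91, "Sikkim": 82.20,
    "Tamil Nadu": 80.33, "Nagaland": 80.11, "Manipur": 79.85,
}


def rank_descending(values):
    """Return {name: rank}, rank 1 being the highest value (no ties assumed)."""
    ordered = sorted(values, key=values.get, reverse=True)
    return {name: position for position, name in enumerate(ordered, start=1)}


ranks = rank_descending(literacy)

# Look up the states occupying the two middle positions of the ranking
state_at_rank = {rank: name for name, rank in ranks.items()}
rank_list = list(ranks.values())
middle_states = (state_at_rank[median_low(rank_list)],
                 state_at_rank[median_high(rank_list)])

for name, rank in sorted(ranks.items(), key=lambda item: item[1]):
    print(f"{rank:2d}  {name:<18} {literacy[name]:.2f}")
print("Middle-ranked states:", middle_states)
```

The ranks say only that one state stands above another; they carry no information about how large the gap between two states is, which is the defining limitation of the ordinal scale.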
Table 1.3 Ordinal data (Literacy rate of few Indian states, 2011)

Name of state       Literacy rate (%)   Rank
Kerala              93.91               1
Mizoram             91.58               2
Tripura             87.75               3
Goa                 87.40               4
Himachal Pradesh    83.78               5
Maharashtra         82.91               6
Sikkim              82.20               7
Tamil Nadu          80.33               8
Nagaland            80.11               9
Manipur             79.85               10

Table 1.4 Mohs scale of hardness of minerals (Ordinal scale)

Hardness   Mineral
1          Talc
2          Gypsum
3          Calcite
4          Fluorite
5          Apatite
6          Feldspar
7          Quartz
8          Topaz
9          Corundum
10         Diamond

1.6.3 Interval Scale

The interval scale consists of measures for which there are equal intervals between each measurement or each group. Thus, interval scales are numeric scales in which the objects are not only given identities and ranked, as in nominal and ordinal scales, but the exact differences or intervals between objects in terms of their property are also known. The scale can compare the differences between a number of pairs of values and so specify the exact location of the objects along a continuous scale.

1.6.3.1 Characteristics of Interval Data

a. The direction and magnitude of position on the scale are known.
b. The exact difference between any two values on the scale is known, but the scale has an arbitrary zero point and unit of measurement.
c. Interval-scaled data can easily be added or subtracted, but multiplication or division is not possible.
d. It gives a precise idea of all the values of the data.
e. In this scale, the value of zero is arbitrary; absolute zero (true zero) is not used. For example, the zero (0) value in pH scale is arbitrary.

1.6.3.2 Application in Geographical Study

It is used for determining the equality of, or differences between, intervals of the values in the arithmetical sense. Interval scales are very useful because they open up the field of statistical analysis of geographical data sets.
Central tendency can be measured by mean, median or mode; variance, mean deviation and standard deviation can also be used as measures of dispersion.

Examples The Celsius temperature scale is the standard example of an interval scale, as the difference between successive values is the same throughout. The difference between 50 and 40 degrees Celsius is a measurable 10 degrees Celsius, as is the difference between 90 and 80 degrees Celsius. Time, for instance in years AD or BC, is another classical example of an interval scale in which the increments are recognized, consistent and measurable. Longitude and compass directions are also examples of the interval scale.

1.6.4 Ratio Scale

All the requirements of the interval scale are met in the ratio scale, and in addition it has an absolute zero (Pal 1998). For instance, the rainfall scale (either in centimetres or in inches) has a true zero base. Thus, if place ‘A’ receives 60 cm of rainfall and place ‘B’ receives 180 cm of rainfall in a year, we can conclude that place ‘B’ receives three times as much rainfall in a year as place ‘A’. The ratio scale tells us about the order of values and the exact difference between units, and allows a wide range of applications of both descriptive and inferential statistics.

1.6.4.1 Characteristics of Ratio Data

a. Two measurements bear the same ratio to each other independent of the units of measurement.
b. Data are amenable to all types of mathematical operations (addition, subtraction, multiplication and division) and to many forms of statistical analysis.
c. Because of its absolute zero, the ratio scale contains the maximum amount of information about any entity.
d. All ratio variables are also interval variables, but not all interval variables are ratio variables.

1.6.4.2 Application in Geographical Study

Ratio scale offers a wealth of possibilities when statistical data and techniques are used in geographical analysis.
It is applied for determining the equality of, or differences between, ratios of the values. Central tendency can be measured by mean, median or mode; different measures of dispersion, for example standard deviation and coefficient of variation, can also be easily computed from ratio scales.

Examples Good examples of ratio scales are the measurement of height, weight, length, stream water velocity, slope, income of people etc. In all these measurements, the zero point is identical and absolute.

The major characteristics of these four types of scale of measurement are shown in Table 1.5.

Table 1.5 Characteristics of different scales of measurement

Characteristics                                     Nominal   Ordinal   Interval   Ratio
                                                    scale     scale     scale      scale
The order of values is known                        No        Yes       Yes        Yes
Counts or frequency distribution                    Yes       Yes       Yes        Yes
Mean                                                No        No        Yes        Yes
Median                                              No        Yes       Yes        Yes
Mode                                                Yes       Yes       Yes        Yes
Quantification of difference between each value     No        No        Yes        Yes
Addition or subtraction of values                   No        No        Yes        Yes
Multiplication or division of values                No        No        No         Yes
Absolute or true zero                               No        No        No         Yes

1.7 Processing of Data

Processing of data is very important and is the prerequisite for the representation, analysis, explanation and interpretation of the collected data. The main aim of data processing is to make the data simple and comprehensible to all. Processing of data is nothing but the classification, arrangement and summarization of data. Galtung (1968) mentioned that ‘Processing of data refers to concentrating, recasting and dealing with data such that they become as amenable to analysis as possible’ (Khan 2006).

The main procedure of data processing starts after the editing and coding of the data. Identification of errors in the collected data and their rectification is called data editing. Soon after the collection of data starts, arrangements should be made to receive and verify the completed forms sequentially.
Generally, many discrepancies, errors and omissions are observed in these completed forms. Defective forms should immediately be sent back for the necessary corrections. Where errors and inconsistencies are plentiful, the collected data should be discarded and new data collected. The completeness, uniformity, legibility and comprehensibility of the collected data should be checked carefully during data editing.

Data coding is carried out after the editing of the data. Coding is the method of assigning numbers or symbols to the data. For close-ended questions, coding is performed before the collection of the data, but for open-ended questions it is done after the editing of the data.

Three methods are followed by geographers and researchers for the processing of geographical data: classification, tabulation and frequency distribution.

1.7.1 Classification of Data

The systematic arrangement of data into different classes and groups based on their common characteristics and similarities is known as data classification. Within the collected data, some items share homogeneous and common characteristics, while others differ from them. The homogeneous items are placed in one group and the dissimilar items in another. According to Kapur (1995), ‘Classification is the process by which individuals and items are arranged into groups or classes according to their resemblances’. A good and useful classification of data should be exhaustive, mutually exclusive, stable and flexible. It should also be specific and must not be ambiguous or clumsy.

1.7.1.1 Objectives of Data Classification

Major objectives of the classification of data are:
(i) To ensure the sequential and systematic arrangement of data based on their characteristics, resemblances and affinity.
(ii) To understand and explain clearly the nature, characteristics and actual conditions of the data by highlighting their similarities and dissimilarities.
(iii) To simplify and summarize the data by reducing their complexities and ambiguities.
(iv) To make the data suitable for comparison and for establishing their relationships.
(v) To make the data meaningful, comprehensible and easily applicable for drawing relevant inferences.

1.7.1.2 Characteristics of Ideal Data Classification

Though there is no hard and fast rule for the classification of data, the following points should be taken into account:
(a) Homogeneity: Homogeneous and common values should be placed in one class and uncommon values in another class.
(b) Purpose oriented: Collected data should be classified in tune with the purpose of the research or investigation.
(c) Clarity: Classification should be clear, simple and easily understandable to all; complexities should be avoided.
(d) Completeness: All the items or values should be carefully included in the classification. No item should be eliminated during data classification.
(e) Mutually exclusive: Classes or groups should be mutually exclusive; no item should be included in more than one class.
(f) Flexibility: Though stability of the classification is important, it should be flexible enough to allow further changes.

1.7.1.3 Types of Classification

Generally, the classification of data is based on the nature and characteristics of the collected data and the objectives of the study or investigation. There are four types of classification of data:

Geographical Classification (Based on Location or Space)

In this type, data regarding phenomena, events or objects are measured and classified based on their geographical distribution and location. It is also called locational or spatial classification. Examples are the classification of state-wise production of rice in India, state-wise population density in India (Table 1.6), district-wise Scheduled Caste population in West Bengal etc.

Table 1.6 Geographical classification of data (Population densities of some states in India, 2011)

Sl. No   Name of state        Population density (persons/sq. km)
1        Bihar                1102
2        West Bengal          1029
3        Kerala               859
4        Uttar Pradesh        828
5        Tamil Nadu           555
6        Arunachal Pradesh    17
7        Mizoram              52
8        Sikkim               86
9        Nagaland             119
10       Manipur              122

Chronological Classification (Based on Time or Period)

In this type of classification, data are measured and arranged in sequence of time (chronologically) and classified according to the time at which they are measured. The change of phenomena or events with respect to time is represented in this classification. Examples are the classification of month-wise water discharge in a river, year-wise total rainfall in India, decadal growth rate of population in India (Table 1.7), year-wise production of coal in India etc.

Table 1.7 Chronological classification of data (Decadal growth rate of population in India, 1901–2011)

Sl. No   Census year   Decadal growth rate of population
1        1901          –
2        1911          5.75
3        1921          -0.31
4        1931          11.00
5        1941          14.22
6        1951          13.31
7        1961          21.64
8        1971          24.80
9        1981          24.66
10       1991          23.87
11       2001          21.54
12       2011          17.64

Qualitative Classification (Attribute)

This type of classification is based on a descriptive characteristic or quality of the data and is expressed in non-measurable terms, like occupation, employment, religion, caste, literacy etc. (Fig. 1.1). If one group possesses a particular attribute, the other group will possess the opposite or another attribute. For example, if one group of people is literate, the other group will be illiterate. Similarly, if one group of people is honest, the other group will be dishonest.

Fig. 1.1 Qualitative classification of data (population)

Quantitative Classification (Numerical)

A quantitative characteristic of the data (a variable) that can be measured and expressed in numerical form is the basis of this type of classification. For example, the monthly income of a group of people can be measured numerically, such as Rs. 12,000, Rs. 16,000, Rs. 20,000, Rs. 30,000 etc. (Table 1.8). Similarly, the monthly expenditure, height, weight and age of people can also be measured in numeric terms.

Table 1.8 Quantitative classification of data (Monthly income of a group of people)

Sl. No   Monthly income       Number of people
1        Rs. 5000–10000       25
2        Rs. 10000–15000      28
3        Rs. 15000–20000      18
4        Rs. 20000–25000      30
5        Rs. 25000–30000      45
6        Rs. 30000–35000      14
7        Rs. 35000–40000      21
8        Rs. 40000–45000      12
9        Rs. 45000–50000      10
10       Rs. 50000–55000      15

1.7.2 Tabulation of Data

Tabulation is the orderly and systematic arrangement of numerical data in columns and rows in order to extract information. It summarizes the data in a logical and orderly manner for presentation, comparison and interpretation, and makes the data brief and concise, as a table contains only the relevant figures. Gregory and Ward (1967) mentioned that ‘Tabulation is the process of condensing classified data in the form of a table, so that it may be more easily understood and so that any comparisons involved may be more readily made’. The main aim of tabulation of data is to put the whole data set in a concise and logical manner.
Connor (1937) stated that ‘Table involves the orderly and systematic presentation of numerical data in a form designed to elucidate the problem under consideration’.

1.7.2.1 Essentials of an Ideal Table

There is no single ideal method for the tabulation of data. Skill in tabulation generally comes with the researcher's years of experience. Nevertheless, the researcher should follow the rules listed below while tabulating statistical information (Fig. 1.2).

Fig. 1.2 Different parts of an ideal table

1. Table number: When many tables are used, they should be numbered, like Table 1, Table 2 etc., for future reference. When there are several columns (more than four), they should also be numbered serially (Das 2009).
2. Title of table: Each table must have a clear and concise title that conveys the contents of the table.
3. Stub: The stub is the left-most column of the table. It should be clear and self-explanatory and is used for the row items and their headings, one item being mentioned in each row.
4. Caption: It is the title for the columns other than the stub and forms the upper part of the table.
5. Body of the table: It is the main part of the table, containing the clear and distinct figures and data displayed in the table.
6. Unit of measurement: Units of measurement, like kg for weight, ft for height, Rs. for price etc., must be clearly mentioned in the column headings.
7. Simplicity: The table must be clear and simple, keeping a balance between length and breadth, and the figures or values should be shown distinctly. The main columns and sub-columns should be indicated by heavy lines and light lines, respectively. The important figures in the table should be indicated by putting them in a prominent place or in bold type.
8. Arrangement: The arrangement of data in the table depends on the nature of the data, the type of the table and the purposes for which they are intended.
Data should be arranged in a logical sequence in the table. For example, time series data must be arranged chronologically.
9. Comparability: The data should be arranged in the table in such a way that they are easy to compare. Comparable columns of figures should be kept as close together as possible. For percentage figures, the basis of calculation of the percentage should be mentioned near the figures to which it relates. Large figures should be rounded and expressed in thousands, millions etc.
10. Source: If the data have been collected and compiled from other sources, the source must be mentioned clearly in the foot-note.
11. Total: The column totals must be given at the bottom of the table; the row totals, if useful, should also be given.
12. Foot-note: Any specific explanation about the figures should be given as a foot-note, marked by a symbol (*), numbers (1, 2, 3, …) or small English letters (a, b, c, …).

Example A blank table has been prepared (Table 1.9) to show the discharge of water in the pre-monsoon, monsoon and post-monsoon seasons, divided into their respective months, during high and low tide at Kolaghat, Soyadighi, Anantapur, Pyratungi, Dhanipur and Geonkhali on the Rupnarayan River.

Table 1.9 Blank table to show season-wise water discharge in Rupnarayan River

                            Water discharge (m3/sec)
Season        Month         Kolaghat      Soyadighi     Anantapur     Pyratungi     Dhanipur      Geonkhali
                            High   Low    High   Low    High   Low    High   Low    High   Low    High   Low
                            tide   tide   tide   tide   tide   tide   tide   tide   tide   tide   tide   tide
Pre-monsoon   February
              March
              April
              May
              Average
Monsoon       June
              July
              August
              September
              Average
Post-monsoon  October
              November
              December
              January
              Average

1.7.2.2 Types of Table

On the basis of purpose and use, tables are of two types:

General Purpose Table

The general purpose table, also called the reference table, is generally voluminous and is used as a repository of information. Special care and attention are required in preparing such a table and tabulating the information in it, so that the information needed for reference may be obtained readily, without any loss of time and effort (Bose 1980). These tables are prepared frequently by the concerned authority.

Example The reports in tabular form prepared by different governmental and government-aided offices are examples of general purpose tables.

Special Purpose Table

The special purpose table, also called the text table or summary table, contains a summary of information and is used for special purposes. Generally, the table is small and is prepared from the information gathered in the reference table. These tables are prepared at short notice, as the need arises.

Example A table prepared with the data collected about smokers in India is a special purpose table.

Again, on the basis of the nature and characteristics of the classification of data, tables are of two types:

Simple Table

A simple table contains data representing one characteristic only; information relating to other characteristics is left out (Table 1.10).

Table 1.10 Simple table (Population size of some selected countries, 2011)

Name of the country   Total population (million)
China                 1360
India                 1210
USA                   304
Indonesia             229
Brazil                193
Pakistan              165

Source: Human Development Report 2011, Oxford University Press, New Delhi.

Complex Table

A complex table contains data representing several characteristics. It shows the figures corresponding to a number of items (Table 1.11).
Table 1.11 Complex table (Hypothetical state of the Earth’s atmosphere)

                Hypothetical state of the Earth’s atmosphere
Altitude (m)    Pressure (MPa)   Temperature (°C)   Density (kg/m3)
0               0.1013           15.0               1.225
1000            0.898            8.5                1.1117
2000            0.759            2.0                1.0581
3000            0.701            -4.5               0.9093

1.7.3 Frequency Distribution

It is the method by which all the observations of a series are divided into a number of classes or groups, and the number of observations falling in each class is shown against that class. The number of times each value occurs is called its frequency, and the table in which the distribution of observations against each variable or class of variables is shown is known as a frequency distribution table or frequency table.

A frequency distribution table contains a condensed summary of the original data. It represents not only the range of the values of a data series but also the nature of their distribution throughout that range. Summarizing data into a frequency distribution table entails some loss of detail, but it is a very helpful and effective way of treating and interpreting a large volume of data. Some values, like the different central values and the values of dispersion and variability of the data, can be calculated easily from the frequency distribution table.

Raw data (unorganized data having no form or structure) have no value in statistical analysis and interpretation. An array (data arranged in order of magnitude, either descending or ascending) has little significance in statistical analysis and interpretation. The frequency distribution of a large volume of information is very useful and significant in statistical analysis and interpretation.
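The progression from raw data to array to frequency distribution described above can be sketched in a few lines of Python. This is a minimal illustration only: the rainfall readings in `raw_data` are hypothetical values invented for the example, and the variable names are not taken from the text.

```python
from collections import Counter

# Hypothetical raw data: daily rainfall (mm) in the order it was recorded
raw_data = [62, 60, 63, 61, 60, 63, 62, 60, 61, 63]

# Array: the same observations arranged in ascending order of magnitude
array = sorted(raw_data)

# Frequency distribution: each distinct value with the number of times it occurs
frequency = Counter(array)
frequency_table = sorted(frequency.items())

print("Rainfall (mm)  Frequency")
for value, count in frequency_table:
    print(f"{value:>13}  {count:>9}")
print("Total frequency:", sum(frequency.values()))
```

The table is shorter than the raw list but no longer records the order in which the observations were made, which is the loss of detail mentioned above.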
On the basis of their nature, frequency distributions are of two types:

Simple (ungrouped) frequency distribution In this type, the observations are not divided into groups or classes; the values of the variable are shown individually (Table 1.12).

Table 1.12 Simple frequency distribution

Amount of rainfall (mm)   Frequency (Number of rainy days)
60                        12
61                        10
62                        8
63                        15
64                        11
65                        7
66                        5

Grouped frequency distribution In this type, the observations are divided into different classes or groups and the number of observations in each class is shown as its frequency (Table 1.13).

Table 1.13 Grouped frequency distribution

Class interval (Temperature in °C)   Frequency (Number of days)
11–15                                37
16–20                                31
21–25                                43
26–30                                19
31–35                                9
36–40                                6
41–45                                5

1.7.3.1 Important Terminologies Associated with Grouped Frequency Distribution

In a grouped frequency distribution, the following terms are very useful and significant:
(a) Class or class interval
(b) Class limit (lower class limit and upper class limit)
(c) Class boundary (lower class boundary and upper class boundary)
(d) Class frequency (f_i) and Total frequency (N)
(e) Class mark or mid-value or mid-point of class interval (x_i)
(f) Class width or size of class interval (w_i)
(g) Frequency density (fd_i)
(h) Relative frequency (Rf_i)
(i) Percentage frequency

(a) Class or class interval: A large number of observations having a wide range is usually classified into several groups according to the size of the values. These groups are called class intervals or simply classes. In Table 1.14, column (1), the class intervals of temperature (in °C) are 11–15, 16–20 etc. There are seven classes in the frequency distribution, the last class being 41–45. The two ends of a class are defined by class limits or boundaries. When both ends of a class are clearly specified, it is called a closed-end class; a class in which one end is not clearly specified is called an open-end class.
When relatively few observations lie far apart from the rest, open-ended classes need to be constructed. Classes having no (zero) frequency are called empty classes.

Table 1.14 Frequency distribution table (Based on the data from Table 1.13)
Class Interval [1] | Class Frequency (f_i) [2] | Class Limit, Lower [3] | Class Limit, Upper [4] | Class Boundary, Lower [5] | Class Boundary, Upper [6] | Class Mark (x_i) [7] | Width of Class (w_i) [8] | Frequency Density (fd_i) [9] | Relative frequency (Rf_i) [10] | Percentage frequency [11]
11–15 | 37 | 11 | 15 | 10.5 | 15.5 | 13 | 5 | 7.4 | 0.247 | 24.7
16–20 | 31 | 16 | 20 | 15.5 | 20.5 | 18 | 5 | 6.2 | 0.207 | 20.7
21–25 | 43 | 21 | 25 | 20.5 | 25.5 | 23 | 5 | 8.6 | 0.286 | 28.6
26–30 | 19 | 26 | 30 | 25.5 | 30.5 | 28 | 5 | 3.8 | 0.127 | 12.7
31–35 | 9 | 31 | 35 | 30.5 | 35.5 | 33 | 5 | 1.8 | 0.06 | 6
36–40 | 6 | 36 | 40 | 35.5 | 40.5 | 38 | 5 | 1.2 | 0.04 | 4
41–45 | 5 | 41 | 45 | 40.5 | 45.5 | 43 | 5 | 1 | 0.033 | 3.3
Total | N = Σf_i = 150 | | | | | | | | 1.00 | 100

(b) Class limit (Lower class limit and Upper class limit): In a grouped frequency distribution, the classes or class intervals, specified by pairs of values, are arranged in such a way that the upper end (upper value) of one class does not coincide with the lower end (lower value) of the immediately following class. These two extreme values, used to specify the limits of a class for the purpose of tallying the original observations into different classes, are known as ‘Class Limits’. The smaller value (lower end value) of the pair is called the lower class limit, whereas the larger value (upper end value) is called the upper class limit of a particular class (Sarkar 2015). In Table 1.14, the values 11, 16, 21, 26, 31, 36 and 41 (in column 3) are the lower class limits, while the values 15, 20, 25, 30, 35, 40 and 45 (in column 4) are the upper class limits.
(c) Class boundary (Lower class boundary and Upper class boundary): Class boundaries are the limits up to which the two limits of each class may be extended to fill the gap which exists between classes (Bose 1980). The upper boundary of one class coincides with the lower boundary of the immediately following class. The lower of the two boundaries is called the lower class boundary and the upper one is called the upper class boundary (columns 5 and 6 in Table 1.14). Class boundaries are calculated from class limits using the following formulas:

$$\text{Lower Class Boundary} = \text{Lower Class Limit} - \frac{d}{2} \tag{1.1}$$

$$\text{Upper Class Boundary} = \text{Upper Class Limit} + \frac{d}{2} \tag{1.2}$$

where ‘d’ is the common difference between the upper class limit of any class (class interval) and the lower class limit of the next class (class interval). If the observations are recorded to the nearest unit, d = 1; if they are recorded to the nearest tenth of a unit, d = 0.1, and so on.

(d) Class frequency and Total frequency: Class frequency, or simply frequency, is the number of observations (values) lying within a class. It is denoted by f_i. Total frequency (N) is the sum of all the class frequencies in a distribution; it gives the total number of observations considered in the frequency distribution. In Table 1.14, the class frequencies are 37, 31, 43, …, 5 (column 2) and the total frequency is 150. The working formula of total frequency (N) is as follows:

$$N = \sum_{i=1}^{n} f_i \tag{1.3}$$

where n = number of classes and f_i = frequency of the ith class.

(e) Class mark or mid-value or mid-point of class interval: The value lying exactly at the middle of a class interval is called the class mark or mid-value (13, 18, 23, …, 43 are the class marks in column 7 of Table 1.14).
The working formula for class mark is as follows:

$$\text{Class Mark } (x_i) = \frac{\text{Lower Class Limit} + \text{Upper Class Limit}}{2} \tag{1.4}$$

Or,

$$\text{Class Mark } (x_i) = \frac{\text{Lower Class Boundary} + \text{Upper Class Boundary}}{2} \tag{1.5}$$

The mid-value of the class, or class mark, is taken as the representative value of the class for the computation of descriptive statistics like mean, mean deviation, standard deviation etc.

(f) Class width or size of class interval: The difference between the upper and lower class boundaries (not the class limits) is called the class width or size of class interval. In Table 1.14, 5 is the class width (column 8) of this frequency distribution.

$$\text{Width of class } (w_i) = \text{Upper class boundary} - \text{Lower class boundary} \tag{1.6}$$

Generally, equal class widths are preferred in a frequency distribution because they simplify the calculation of some statistical measures (mean, median, mode, mean deviation, standard deviation etc.) by the short-cut method. In a few cases, however, classes of unequal size may be constructed when the values are highly dispersed and some of them are few and far away from the rest. In such cases, the use of equal widths may result in some ‘Empty classes’, i.e. classes with zero frequency.

(g) Frequency density: The number of observations per unit of class width is called the frequency density of a class. The greater the frequency per unit class width, the greater the frequency density, and vice versa. Frequency density represents the degree of concentration of frequency in a particular class and is calculated by the following formula:

$$\text{Frequency density } (fd_i) = \frac{\text{Class frequency}}{\text{Class width}} = \frac{f_i}{w_i} \tag{1.7}$$

In Table 1.14, the frequency densities of different classes are 7.4, 6.2, 8.6 etc. (Column 9). Frequency density is used for drawing a histogram when the frequency distribution has unequal class widths.
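As an illustration only, the derived columns of Table 1.14 can be reproduced with a few lines of Python. This is a minimal sketch: the function name `describe_classes` and the dictionary keys are assumptions made for the example, while the class limits and frequencies are those of Table 1.13.

```python
# Class limits and frequencies of Table 1.13 (temperature in °C).
classes = [(11, 15, 37), (16, 20, 31), (21, 25, 43), (26, 30, 19),
           (31, 35, 9), (36, 40, 6), (41, 45, 5)]


def describe_classes(classes):
    """Return one dict per class with boundaries, class mark, width and density."""
    # d: gap between the upper limit of one class and the lower limit of the next.
    d = classes[1][0] - classes[0][1]
    rows = []
    for lower, upper, freq in classes:
        lcb = lower - d / 2            # Eq. 1.1
        ucb = upper + d / 2            # Eq. 1.2
        width = ucb - lcb              # Eq. 1.6
        rows.append({
            "limits": (lower, upper),
            "boundaries": (lcb, ucb),
            "mark": (lower + upper) / 2,   # Eq. 1.4
            "width": width,
            "density": freq / width,       # Eq. 1.7
        })
    return rows


rows = describe_classes(classes)
total_frequency = sum(freq for _, _, freq in classes)   # Eq. 1.3
for row in rows:
    print(row)
print("N =", total_frequency)
```

The sketch assumes that the gap d is the same between every pair of adjacent classes, as it is in Table 1.13.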
(h) Relative frequency: In a frequency distribution, the ratio between the frequency of a particular class (f_i) and the total frequency (N) of the distribution is called the relative frequency (Rf_i). The sum of all relative frequencies in a distribution is equal to unity (1).

$$\text{Relative Frequency } (Rf_i) = \frac{f_i}{N} \tag{1.8}$$

and

$$\sum_{i=1}^{n} Rf_i = 1 \tag{1.9}$$

where n = number of classes. In Table 1.14, the relative frequencies of different classes are 0.247, 0.207, 0.286 etc. (Column 10).

(i) Percentage frequency: Percentage frequency is the class frequency expressed as a percentage of the total frequency. In other words, when relative frequency is expressed as a percentage, it is called percentage frequency.

$$\text{Percentage frequency} = \frac{\text{Class frequency}}{\text{Total frequency}} \times 100 \tag{1.10}$$

In Table 1.14, the percentage frequencies of different classes are 24.7, 20.7, 28.6 etc. (Column 11). The sum of all percentage frequencies in a distribution is equal to one hundred per cent (100%).

1.7.3.2 Construction of Frequency Distribution Table

In a grouped frequency distribution, the main questions are: (a) selection of the number of classes, (b) selection of class width and (c) selection of class limits and boundaries.

(a) Selection of the number of classes: There is no hard and fast rule for selecting the number of classes into which the observations are divided. Generally, it depends on the nature of the data, the number of observations in the series and the purpose for which the data are intended. It is generally agreed that the number of classes should be neither very large (which gives a lengthy and unwieldy frequency distribution) nor very small (information is lost and the true pattern of the distribution of observations is obscured). Normally, the number of classes should lie between 5 and 15, depending on the number of observations available.
In case of a small number of observations, some authors suggest the use of Sturges’ formula:

$$n = 1 + 3.3 \log N \tag{1.11}$$

where n is the number of classes and N is the total number of observations in the data series.

(b) Selection of class width: Selection of the class width depends on the number of observations in the data series and the number of classes into which the observations are divided. For this purpose, we first have to calculate the range (difference between the highest and lowest values of the observations) of the data. If classes of equal width are wanted, the width of the classes can be obtained by the formula:

$$\text{Class width} = \frac{\text{Range (Highest value of the series} - \text{Lowest value of the series)}}{\text{Number of classes}} \tag{1.12}$$

Or,

$$\text{Class width} = \frac{\text{Range (Highest value of the series} - \text{Lowest value of the series)}}{1 + 3.3 \log N} \tag{1.13}$$

Similarly, if the class width is known, the number of classes of the frequency distribution can be calculated by:

$$\text{Number of classes} = \frac{\text{Range (Highest value of the series} - \text{Lowest value of the series)}}{\text{Class width}} \tag{1.14}$$

Example: If the maximum and minimum values in a data series are 887 and 105, respectively, then the range of the data will be 887 − 105 = 782. With 8 classes, the width of the classes (w_i) will be 782/8 = 97.75.

It is important to note that if the range of the data set is approximated to the nearest round figure that can easily be divided by the number of classes (n), the class width becomes easier to recognize and work with. In the above example, the maximum value of 887 can be taken as 900 and the minimum value of 105 as 100. The range of the data then becomes 900 − 100 = 800 and the class width for 8 classes will be (900 − 100)/8 = 800/8 = 100. So, the class width of 97.75 can easily be modified to 100 for practical applications.
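The arithmetic of Eqs. 1.11 and 1.12 can be checked with a short Python sketch. The function names `sturges_classes` and `class_width` are hypothetical and chosen only for this illustration; the maximum is taken as 887, the figure used in the subtraction of the worked example.

```python
import math


def sturges_classes(n_obs):
    """Number of classes suggested by Eq. 1.11: n = 1 + 3.3 log10(N)."""
    return 1 + 3.3 * math.log10(n_obs)


def class_width(highest, lowest, n_classes):
    """Equal class width from Eq. 1.12: range / number of classes."""
    return (highest - lowest) / n_classes


width_exact = class_width(887, 105, 8)   # 782 / 8
width_round = class_width(900, 100, 8)   # limits rounded to convenient figures
print(width_exact, width_round)

# Eq. 1.11 for a series of 100 observations, rounded to a whole number of classes.
print(round(sturges_classes(100)))
```

Rounding the result of Eq. 1.11 to the nearest whole number is one reasonable convention; the text leaves the final choice of the number of classes to the analyst.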
Consideration of class width is very significant because it is an important determinant in the selection of class limits, class boundaries and class marks, which are used not only in preparing the frequency distribution table but also in the computation of different descriptive statistical measures (Sarkar 2015).

(c) Selection of class limits and boundaries: Class limits are selected in two ways:
(i) Exclusive method
(ii) Inclusive method

(i) Exclusive method: In this method, the upper limit of one class coincides with the lower limit of the next class, i.e. the upper limit of one class and the lower limit of the following class have the same value. For example, 25–35, 35–45, 45–55 etc. (Table 1.15). In this situation, the question arises as to which class a value identical to the coinciding limits should be placed in. The problem is solved by excluding the identical value from the previous class and including it in the following class. In this sense, the upper limit of each class is read as ‘less than’ that limit, while the lower limit of each class represents the exact value. The class intervals of the above example are then stated as 25 to less than 35, 35 to less than 45, 45 to less than 55 etc.

Table 1.15 Exclusive and inclusive methods of selection of class limit
Exclusive method: Class limit (Weight in kg) | Exclusive method: Frequency (Number of persons) | Inclusive method: Class limit (Number of workers) | Inclusive method: Frequency (Number of factories)
25–35 | 6 | 25–34 | 5
35–45 | 4 | 35–44 | 5
45–55 | 8 | 45–54 | 7
55–65 | 3 | 55–64 | 3
65–75 | 4 | 65–74 | 5

(ii) Inclusive method: In this method, the lower limit and upper limit of a particular class are both included within the same class. Thus, the upper limit of one class does not coincide with the lower limit of the following class, and a gap exists between the upper limit of one class and the lower limit of the following class. For example, 25–34, 35–44, 45–54 etc. (Table 1.15).
This method of classification may be applied to the grouped frequency distribution of discrete variables, such as number of family members, number of households, number of industrial workers etc., which can take integral values only. It is not suitable for variables with fractional values like temperature, weight, height etc. Thus, the nature of the variable under observation (continuous or discrete) decides whether the exclusive method or the inclusive method should be used for the selection of class limits. The exclusive method must be used for the classification of continuous variables, whereas the inclusive method is suitable for discrete variables.

A frequency distribution table is drawn with (n + 1) rows and 6 columns (for equal class width) or 7 columns (for unequal class width). The column heads, from left to right, are: class limits, class boundaries, class mark (x_i), class width (w_i), tally marks and frequency (f_i). In case of unequal classes, an additional column headed frequency density (f_i/w_i) is drawn.

Example-1 Heights (in metre) of 35 places from mean sea level are given below. Prepare a frequency distribution table from the given data.

412 350 307 308 432 342 357 297 328 375 356 429 329 240
353 403 355 404 350 335 304 332 281 335 361 266 324 302
406 366 337 345 343 227 364

Solution:
Number of classes (n) = 1 + 3.3 log N
= 1 + 3.3 log 35 [N = number of observations]
≈ 6.1
= 6 (nearest round figure)

Class width (w) = Range (Highest value of the series − Lowest value of the series) / Number of classes
= (432 m − 227 m) / 6
= 34.17 m
= 35 m (rounded up to a convenient figure)
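The tallying step that follows from this solution can be sketched in Python. This is only an illustration under stated assumptions: the variable names are hypothetical, the inclusive method is used because the heights are recorded as whole metres, and the first class is assumed to start at the lowest observation (227 m) with the rounded width of 35 m.

```python
# Heights (m) of the 35 places given in Example-1.
heights = [412, 350, 307, 308, 432, 342, 357, 297, 328, 375, 356, 429, 329, 240,
           353, 403, 355, 404, 350, 335, 304, 332, 281, 335, 361, 266, 324, 302,
           406, 366, 337, 345, 343, 227, 364]

lowest, width, n_classes = 227, 35, 6

# Inclusive class limits: 227-261, 262-296, ... (both ends belong to the class).
limits = [(lowest + i * width, lowest + (i + 1) * width - 1)
          for i in range(n_classes)]

# Tally: count the observations falling within each pair of limits.
frequencies = [sum(lo <= h <= hi for h in heights) for lo, hi in limits]

for (lo, hi), f in zip(limits, frequencies):
    print(f"{lo}-{hi}: {f}")
print("N =", sum(frequencies))
```

Because every observation is counted in exactly one class, the frequencies add up to the 35 observations of the series.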
Table 1.16 Frequency distribution table showing the height (in metre) from mean sea level
Class limit (Height in m) | Class Boundary (Height in m) | Class mark (x_i) | Class width (w_i) | Tally marks | Frequency (f_i)
227–261 | 226.5–261.5 | 244 | 35 | || | 2
262–296 | 261.5–296.5 | 279 | 35 | || | 2
297–331 | 296.5–331.5 | 314 | 35 | ||||| ||| | 8
332–366 | 331.5–366.5 | 349 | 35 | ||||| ||||| ||||| | | 16
367–401 | 366.5–401.5 | 384 | 35 | | | 1
402–436 | 401.5–436.5 | 419 | 35 | ||||| | | 6
Total | | | | | N = Σf_i = 35

Example-2 Mean monthly temperatures (in °F) of 40 places are given below. Prepare a frequency distribution table from the given data.

29.4 49.5 39.6 45.7 53.8 39.7 36.6 58.7 34.4 39.7
54.4 54.7 51.5 62.5 23.0 80.8 30.3 27.7 44.0 35.7
56.1 60.2 72.2 50.8 33.4 56.3 32.4 59.2 48.2 45.1
42.7 52.1 24.2 68.6 66.0 39.5 43.4 36.4 44.1 56.6

Solution:
Number of classes (n) = 1 + 3.3 log N
= 1 + 3.3 log 40 [N = number of observations]
≈ 6.3
= 6 (nearest round figure)

Class width (w) = Range (Highest value of the series − Lowest value of the series) / Number of classes
= (80.8 °F − 23.0 °F) / 6
= 9.63 °F
= 10 °F (rounded up to a convenient figure)

Table 1.17 Frequency distribution table showing the mean monthly temperature (°F)
Class limit (Temperature in °F) | Class Boundary (Temperature in °F) | Class mark (x_i) | Class width (w_i) | Tally marks | Frequency (f_i)
23.0–32.9 | 22.95–32.95 | 27.95 | 10 | ||||| | | 6
33.0–42.9 | 32.95–42.95 | 37.95 | 10 | ||||| ||||| | 10
43.0–52.9 | 42.95–52.95 | 47.95 | 10 | ||||| ||||| | 10
53.0–62.9 | 52.95–62.95 | 57.95 | 10 | ||||| ||||| | 10
63.0–72.9 | 62.95–72.95 | 67.95 | 10 | ||| | 3
73.0–82.9 | 72.95–82.95 | 77.95 | 10 | | | 1
Total | | | | | N = Σf_i = 40

1.7.3.3 Cumulative Frequency Distribution

The accumulated frequency up to or above some value of the variable is known as ‘Cumulative frequency’. The cumulative frequency corresponding to a particular value of the variable can be defined as the number of observations smaller than or greater than that value (Das 2009).
A cumulative frequency distribution is a form of frequency distribution in which the cumulative frequency up to each class is shown against that class (Bose 1980). The cumulative frequency of any class is calculated by adding the frequency of that class to the total frequency of the previous classes. It represents the progressive total of the frequencies falling under each class.

A cumulative frequency distribution can be formed in two ways: (i) by the less than method and (ii) by the more than method. The number of observations ‘up to’ a given value is called the less than cumulative frequency and the number of observations ‘greater than’ a value is called the more than cumulative frequency. In the less than method, the frequencies are accumulated from the lowest class upwards, whereas in the more than method, the frequencies are accumulated from the highest class downwards.

1.7.3.4 Uses of Cumulative Frequency Distribution

A cumulative frequency distribution is used to determine the number of observations less than or greater than a particular value. It is very helpful in finding
(a) the number of observations less than or below any given value
(b) the number of observations more than or above any given value
(c) the number of observations falling between two specific values.
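The two methods of accumulation and the three uses listed above can be sketched in Python with the class boundaries and frequencies of Table 1.16. The names `less_than`, `more_than` and `count_between` are assumptions made for this illustration, and the helper only accepts values that coincide with class boundaries.

```python
from itertools import accumulate

# Class boundaries (m) and class frequencies taken from Table 1.16.
boundaries = [226.5, 261.5, 296.5, 331.5, 366.5, 401.5, 436.5]
frequencies = [2, 2, 8, 16, 1, 6]

# Less than method: accumulate from the lowest class upwards.
less_than = [0] + list(accumulate(frequencies))

# More than method: what remains above each boundary.
total = sum(frequencies)
more_than = [total - f for f in less_than]


def count_between(lower_boundary, upper_boundary):
    """Number of observations lying between two class boundaries."""
    i = boundaries.index(lower_boundary)
    j = boundaries.index(upper_boundary)
    return less_than[j] - less_than[i]


for b, lt, mt in zip(boundaries, less_than, more_than):
    print(f"{b}: less than = {lt}, more than = {mt}")
print(count_between(296.5, 366.5))   # observations between 296.5 m and 366.5 m
```

At every boundary the less than and more than cumulative frequencies add up to the total frequency N, which gives a quick check on the table.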
Table 1.18 Cumulative frequency distribution table using the data of Table 1.16
Class limit (Height in m) | Class Boundary (Height in m) | Frequency (f_i) | Less than: boundary | Less than: F | More than: boundary | More than: F
227–261 | 226.5–261.5 | 2 | 226.5 | 0 | 226.5 | 35 (0 + 6 + 1 + 16 + 8 + 2 + 2)
262–296 | 261.5–296.5 | 2 | 261.5 | 2 (0 + 2) | 261.5 | 33 (0 + 6 + 1 + 16 + 8 + 2)
297–331 | 296.5–331.5 | 8 | 296.5 | 4 (0 + 2 + 2) | 296.5 | 31 (0 + 6 + 1 + 16 + 8)
332–366 | 331.5–366.5 | 16 | 331.5 | 12 (0 + 2 + 2 + 8) | 331.5 | 23 (0 + 6 + 1 + 16)
367–401 | 366.5–401.5 | 1 | 366.5 | 28 (0 + 2 + 2 + 8 + 16) | 366.5 | 7 (0 + 6 + 1)
402–436 | 401.5–436.5 | 6 | 401.5 | 29 (0 + 2 + 2 + 8 + 16 + 1) | 401.5 | 6 (0 + 6)
 | | N = Σf_i = 35 | 436.5 | 35 (0 + 2 + 2 + 8 + 16 + 1 + 6) | 436.5 | 0

Cumulative frequency may be represented in relative or percentage form. When it is represented as a percentage, it is known as cumulative percentage. It is very helpful for the comparison between frequencies.

Example-1 See Table 1.18.
Example-2 See Table 1.19.

1.8 Methods of Presentation of Geographical Data

Presentation of data means the demonstration of the data in an attractive and lucid manner to make them easily understandable to all. Suitable and accurate visualization of the collected data helps in their proper understanding, analysis and explanation.
Table 1.19 Cumulative frequency distribution table using the data of Table 1.17
Class limit (Temperature in °F) | Class Boundary (Temperature in °F) | Frequency (f_i) | Less than: boundary | Less than: F | More than: boundary | More than: F
23.0–32.9 | 22.95–32.95 | 6 | 22.95 | 0 | 22.95 | 40 (0 + 1 + 3 + 10 + 10 + 10 + 6)
33.0–42.9 | 32.95–42.95 | 10 | 32.95 | 6 (0 + 6) | 32.95 | 34 (0 + 1 + 3 + 10 + 10 + 10)
43.0–52.9 | 42.95–52.95 | 10 | 42.95 | 16 (0 + 6 + 10) | 42.95 | 24 (0 + 1 + 3 + 10 + 10)
53.0–62.9 | 52.95–62.95 | 10 | 52.95 | 26 (0 + 6 + 10 + 10) | 52.95 | 14 (0 + 1 + 3 + 10)
63.0–72.9 | 62.95–72.95 | 3 | 62.95 | 36 (0 + 6 + 10 + 10 + 10) | 62.95 | 4 (0 + 1 + 3)
73.0–82.9 | 72.95–82.95 | 1 | 72.95 | 39 (0 + 6 + 10 + 10 + 10 + 3) | 72.95 | 1 (0 + 1)
 | | N = Σf_i = 40 | 82.95 | 40 (0 + 6 + 10 + 10 + 10 + 3 + 1) | 82.95 | 0

Geographical data can be represented and portrayed in the following four ways:

1.8.1 Textual Form

Textual presentation is the most raw and vague form of representation of geographical data. In textual form, data are presented in paragraphs or in sentences. This form of presentation is most appropriate and effective when the amount of data is not too large. In textual presentation, mainly the important characteristics are enumerated, giving emphasis to the most significant figures and highlighting the most striking attributes of the data set. Significant figures and attributes may be summary statistics like the maximum and minimum values, mean, median, mean deviation, standard deviation etc.

Example: Out of 180 sediment samples studied in the Rupnarayan River, approximately 63.80% of the sediments are very fine sand, 14.76% are fine sand and 21.44% are coarse silt. In the dry season, more than 60% of the sediments are moderately to well sorted, but in the monsoon season 63.85% of the sediments are poorly to very poorly sorted. Around 55% of the sediments are of fine and very fine skewed type, 33% of the samples are near symmetrical and the remaining 12% are of coarse skewed type.
1.8.1.1 Advantages and Disadvantages of Textual Form

Advantages
1. Easy to understand.
2. It enables one to give emphasis to certain important features of the data presented.

Disadvantages
1. One has to read the complete text for comprehension.
2. Boring to read, especially if too lengthy.
3. The reader may skip the statements.

1.8.2 Tabular Form

Tabular presentation of geographical data is very important and easily understandable to all. It is one of the most commonly used forms of representation of data, as tables are very easy to construct and understand. A table makes it possible to represent even large amounts of data in a lucid, attractive and organized manner. Tabulation is the orderly and systematic arrangement of numerical data in columns and rows in order to extract information. It summarizes the data in a logical and orderly manner for the purposes of presentation, comparison and interpretation, and makes the data brief and concise as they contain only the relevant figures (Table 1.20) [Detailed discussion in Sect. 1.7.2].

Table 1.20 Tabular presentation of data (% of sand, silt and clay in bed sediments of Rupnarayan River)
Sand-silt-mud proportion (%)
Locations | Pre-monsoon: Sand | Pre-monsoon: Silt | Pre-monsoon: Clay | Monsoon: Sand | Monsoon: Silt | Monsoon: Clay | Post-monsoon: Sand | Post-monsoon: Silt | Post-monsoon: Clay
Kolaghat | 68–91 | 8–30 | 1–15 | 76–91 | 8–17 | 1–12 | 71–86 | 9–20 | 5–20
Soyadighi | 60–78 | 12–25 | 8–18 | 70–86 | 8–21 | 2–18 | 70–84 | 7–21 | 8–18
Anantapur | 45–78 | 13–48 | 6–38 | 59–86 | 10–42 | 4–17 | 55–78 | 14–39 | 4–22
Pyratungi | 54–79 | 8–32 | 12–26 | 73–87 | 9–20 | 4–15 | 56–85 | 4–17 | 9–18
Dhanipur | 38–76 | 11–61 | 1–41 | 52–84 | 10–45 | 3–18 | 45–75 | 24–54 | 1–25
Geonkhali | 46–78 | 9–40 | 12–38 | 61–87 | 9–18 | 4–19 | 49–74 | 10–32 | 16–21
Source: Field survey and laboratory experiment.

1.8.2.1 Advantages and Disadvantages of Data Representation in Table

Advantages
The advantages of tabulation of data are as follows:
1. By tabulation, data are arranged systematically and logically in concise form.
2. Tabulation makes the data easily understandable and is more impressive than textual presentation.
3. It is very useful for detecting errors and omissions in the data.
4. Recurrence of explanatory terms and phrases can be avoided.
5. The nature and characteristics of the data can easily be understood at a glance in tabular form.
6. Comparison and interpretation of statistical data become easy.

Disadvantages
1. Tabular presentation does not give a detailed view of the data, unlike textual (descriptive) presentation.
2. It is helpful mainly for identifying differences between items or for comparing two or more things.

1.8.3 Semi-Tabular Form

It is the combination of the textual and tabular forms of data presentation, also called partial-tabular presentation of data. It is helpful for easy comparison because the numerical figures are presented separately from the text.

Example: Overall literacy rates in different census years after independence in India are:
• 16.67% in 1951
• 24.02% in 1961
• 29.45% in 1971
• 36.23% in 1981
• 42.84% in 1991
• 54.51% in 2001
• 64.32% in 2011.

1.8.4 Graphical Form (Graphs, Diagrams and Maps)

In addition to all the above-mentioned methods, classified and tabulated geographical data can suitably and easily be represented through different graphs (line graph, climograph, Lorenz curve, rank-size graph, frequency graph etc.), diagrams (bar diagram, pie-diagram, rectangular diagram etc.) and maps (choropleth map, chorochromatic map, choroschematic map etc.). All the graphs, diagrams and maps are drawn following various geometric methods; thus it is known as geometric representation of data. Representation of geographical data by graphical, diagrammatic and mapping techniques is very popular, attractive and easy to understand for geographers, researchers and common literate people alike.

References
Bose A (1980) Statistics.
Calcutta Book House, 1/1 Bankim Chatterjee Street, Calcutta 700073
Connor LR (1937) Statistics in theory & practice, 2nd edn. Sir Isaac Pitman & Sons, Inc
Das NG (2009) Statistical methods, vol I & II. McGraw Hill Education (India) Pvt Ltd. ISBN: 978-0-07-008327-1
Galtung J (1968) A structural theory of integration. J Peace Res 5(4):375–395
Gregory H, Ward D (1967) Statistics for business studies. McGraw-Hill. ISBN: 9780070944909
Kapur SK (1995) Elements of practical statistics. Oxford & IBH Publishing Co Pvt Ltd, New Delhi
Khan MAT (2006) Quantitative techniques in geography. Perfect Publications, Dhaka. ISBN: 984-8642-02-1
Pal SK (1998) Statistics for geoscientists: techniques and applications. Concept Publishing Company, New Delhi. ISBN: 81-7022-712-1
Sarkar A (2015) Practical geography: a systematic approach. Orient Blackswan Private Limited, Hyderabad, Telengana, India. ISBN: 978-81-250-5903-5
Young PV (1994) Scientific social surveys and research. Prentice Hall of India Private Limited, New Delhi

Chapter 2 Representation of Geographical Data Using Graphs

Abstract Suitable, accurate and lucid presentation and visualization of geographical data using various types of graphs help in their correct analysis, explanation and realization for a proper understanding of the real world. It is very simple, attractive and easily recognizable not only to geographers or efficient academicians but also to common literate people. This chapter includes a detailed classification of all types of graphs and a discussion of various types of co-ordinate systems, with illustrations, as an essential basis for the construction of graphs. Different types of bi-axial (arithmetic and logarithmic graph, climograph etc.), tri-axial (ternary graph), multi-axial (spider graph, polar graph etc.) and special graphs (water budget graph, hydrograph, rating curve, Lorenz curve, rank-size graph, hypsometric curve etc.)
have been discussed with suitable examples in terms of their suitable data structure, necessary numerical calculations, methods of construction, appropriate illustrations, and advantages and disadvantages of their use. Systematic and step-by-step discussion of the methods of their construction helps readers understand the graphs easily and quickly. The difference between arithmetic and logarithmic graphs is explained precisely with proper examples and illustrations. Different types of frequency distribution graphs have been explained with suitable data, necessary mathematical and statistical computations, and proper illustrations. All types of graphs represent a perfect co-relation between the theoretical knowledge of various geographical events and phenomena and their realistic implications with suitable examples.

Keywords Graphs · Co-ordinate system · Bi-axial graph · Arithmetic and logarithmic graph · Tri-axial graph · Multi-axial graph · Special graph · Frequency distribution graph

2.1 Concept of Graph

The method of representing geographical data in graphs has been developed to avoid the difficulties arising from their tabular presentation and to aid their understanding. This technique is very helpful for understanding and explaining the relationships between various geographic data, indicating the trends of different geographic variables and making comparisons between them. Graph is the most familiar and conventional method in which a series of geographical data are represented following a suitable co-ordinate system on a reference frame.

© The Author(s), under exclusive license to Springer Nature Singapore Pte Ltd. 2021
S. K. Maity, Essential Graphical Techniques in Geography, Advances in Geographical and Environmental Sciences, https://doi.org/10.1007/978-981-16-6585-1_2

2.2 Types of Co-ordinate System

The four main types of co-ordinate systems are:
(1) Cartesian or rectangular co-ordinate system
(2) Polar co-ordinate system
(3) Cylindrical co-ordinate system
(4) Spherical co-ordinate system

2.2.1 Cartesian or Rectangular Co-ordinate System

The Cartesian or rectangular co-ordinate system is a co-ordinate system that identifies each point uniquely on a plane with the help of a set of numerical co-ordinates. Co-ordinates are the signed (either positive or negative) distances to the specific point from two perpendicularly oriented lines. Both the co-ordinates and the lines are measured and represented in the same unit of length. This co-ordinate system provides a technique for portraying graphs and representing the positions of points on a two-dimensional (2D) surface as well as in a three-dimensional (3D) space.

On a two-dimensional (2D) surface, while constructing a graph, at first a horizontal line XX′ called the abscissa and a vertical line YY′ called the ordinate (both are known as co-ordinate axes) are drawn, which intersect each other at right angles, and the whole plotting area is divided into four parts called quadrants (Fig. 2.1). The point of intersection of these two axes is called the point of origin (‘O’) or zero-point, having x–y co-ordinate (0, 0). All the distances along these two axes are always measured from this zero point.
The rightward and upward measurements of distance from the zero point indicate positive values, whereas the leftward and downward measurements represent negative values. The values of both ‘x’ and ‘y’ are positive in the first quadrant, but in the second quadrant, the values of ‘x’ and ‘y’ are negative and positive, respectively. In the third quadrant, both ‘x’ and ‘y’ are negative, while in the fourth quadrant, the ‘x’ value is positive but the ‘y’ value is negative (Fig. 2.1).

Most of the geographical data collected from field investigations are positive in nature and hence are represented in the first quadrant, in which the values of both ‘x’ and ‘y’ are positive. The second, third and fourth quadrants are used comparatively less in statistical and geographical analysis.

Fig. 2.1 Position of independent and dependent variables in different quadrants (Cartesian co-ordinate system)

In Cartesian three-dimensional (3D) space, another axis oriented at right angles to the xy plane is added. This axis passes through the origin of the xy plane and is called the ‘Z’ axis, representing the height. Positions or co-ordinates of points (‘P’, ‘Q’, ‘R’, ‘S’ in Fig. 2.2) are determined from the east–west (x), north–south (y) and up–down (z) displacements of the points from the origin ‘O’ (0, 0, 0).

2.2.2 Polar Co-ordinate System

The polar co-ordinate system is another common and important co-ordinate system for the plane. This two-dimensional co-ordinate system specifies the location of each point on a plane by the distance from a reference point (called the pole) and an angle from a reference direction (called the polar axis). The location of the point is obtained by measuring the signed distance from the origin (pole) and the given angle (measured counter-clockwise) from the polar axis. For a given distance from the origin ‘r’ and angle from the polar axis ‘θ’, the co-ordinate of any point (‘P’ in Fig.
2.3) is (r, θ). The pole is characterized by (0, θ) for any value of θ. The polar co-ordinate system is extended to three dimensions in two ways: the cylindrical co-ordinate system and the spherical co-ordinate system.

Fig. 2.2 Determination of location of a point on Cartesian co-ordinate system (3D)

Fig. 2.3 Determination of location of a point on polar co-ordinate system

The relation between the Cartesian co-ordinate (x, y) and the polar co-ordinate (r, θ) is (Fig. 2.3):

$$\sin\theta = \frac{y}{r} \quad \text{(sine function for } y\text{)} \tag{2.1}$$

where y = r sin θ, and

$$\cos\theta = \frac{x}{r} \quad \text{(cosine function for } x\text{)} \tag{2.2}$$

where x = r cos θ. Again,

$$r = \sqrt{x^2 + y^2} \quad \text{(Pythagoras theorem to find the long side, i.e. the hypotenuse)} \tag{2.3}$$

and

$$\theta = \tan^{-1}\frac{y}{x} \quad \text{(tangent function to find the angle)} \tag{2.4}$$

Conversion from Cartesian to polar co-ordinate system

Example: What is (4, 3) in polar co-ordinates?

Solution: Using Pythagoras theorem (Eq. 2.3) we have

$$r = \sqrt{x^2 + y^2} = \sqrt{4^2 + 3^2} = \sqrt{16 + 9} = \sqrt{25} = 5$$

Using the tangent function (Eq. 2.4) we have

$$\theta = \tan^{-1}\frac{y}{x} = \tan^{-1}\frac{3}{4} = \tan^{-1} 0.75 = 36.87^\circ$$

Answer: The point (4, 3) is (5, 36.87°) in polar co-ordinates.

Conversion from polar to Cartesian co-ordinate system

Example: What is (5, 36.87°) in Cartesian co-ordinates?

Solution: Using the sine function (Eq. 2.1) we have

$$\sin\theta = \frac{y}{r} \;\Rightarrow\; y = 5 \times \sin 36.87^\circ = 5 \times 0.6 = 3$$

Using the cosine function (Eq. 2.2) we have

$$\cos\theta = \frac{x}{r} \;\Rightarrow\; x = 5 \times \cos 36.87^\circ = 5 \times 0.8 = 4$$

Answer: The point (5, 36.87°) is (4, 3) in Cartesian co-ordinates.

2.2.3 Cylindrical Co-ordinate System

The cylindrical co-ordinate system is the extension of the polar co-ordinates obtained by adding the z-axis along the height of a right circular cylinder. The z-axis in this co-ordinate system is the same as in the Cartesian co-ordinate system (3D).
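The two polar conversions worked in Sect. 2.2.2 (Eqs. 2.1–2.4) can be verified with a short Python sketch. The function names are hypothetical, and `math.atan2` is used in place of a plain tan⁻¹(y/x) as an assumption of this example, because it returns the correct angle in every quadrant and also handles x = 0.

```python
import math


def cartesian_to_polar(x, y):
    """Return (r, theta in degrees) for the Cartesian point (x, y)."""
    r = math.hypot(x, y)                      # Eq. 2.3
    theta = math.degrees(math.atan2(y, x))    # Eq. 2.4, quadrant-aware
    return r, theta


def polar_to_cartesian(r, theta_deg):
    """Return (x, y) for the polar point (r, theta in degrees)."""
    theta = math.radians(theta_deg)
    return r * math.cos(theta), r * math.sin(theta)   # Eqs. 2.2 and 2.1


r, theta = cartesian_to_polar(4, 3)
print(round(r, 2), round(theta, 2))

x, y = polar_to_cartesian(5, 36.87)
print(round(x, 2), round(y, 2))
```

For points in the first quadrant, such as (4, 3), `atan2` and Eq. 2.4 give the same angle; the two differ only when x is negative or zero.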
The addition of the z-axis to the polar co-ordinate system gives a triple (r, θ, z) (Fig. 2.4). In some texts, ρ is used in place of r to denote the distance from the origin to the foot of the perpendicular, to avoid confusion. In terms of the Cartesian co-ordinate system,

Fig. 2.4 Determination of location of a point on cylindrical co-ordinate system

\[ x = \rho\cos\theta \tag{2.5} \]

\[ y = \rho\sin\theta \tag{2.6} \]

\[ z = z \quad \text{(height)} \]

Again, the inverse relations are

\[ \rho = \sqrt{x^2 + y^2} \tag{2.7} \]

\[ \theta = \tan^{-1}\frac{y}{x} \tag{2.8} \]

\[ z = z \quad \text{(height)} \]

2.2.4 Spherical Co-ordinate System

The spherical co-ordinate (also known as spherical polar co-ordinate) system is a curvilinear co-ordinate system in which positions of points are defined on a sphere or spheroid. In spherical co-ordinates, the z co-ordinate is replaced by an angle φ, giving a triple (r, θ, φ). Here, r is the distance of a point (say P in Fig. 2.5a, b) from the origin (radial distance) and θ (azimuthal angle) is the angle between the x-axis and the line joining the origin to P′, the foot of the perpendicular from the point P on the x–y plane (Fig. 2.5a, b). The angle θ corresponds to the longitude [0 ≤ θ < 2π] and is denoted as λ when referred to as longitude. The angle φ (polar angle, zenith angle or colatitude) is the angle made by the radius vector (the vector which connects the point P with the origin) with the z-axis. It is complementary to the latitude [0 ≤ φ ≤ π] and is represented as φ = 90° − δ, where δ is the latitude.

Conventionally, (r, θ, φ) is used in mathematics to represent radial distance, azimuthal angle and polar angle, respectively. Sometimes, especially in physics, θ and φ are used in the reverse sense, i.e. θ indicates the polar or zenith angle and φ indicates the azimuthal angle. Then, (r, θ, φ) represent radial distance, polar angle and azimuthal angle, respectively.
In spherical co-ordinates, the symbol ρ is frequently used instead of r to avoid confusion with the value r in 2D polar co-ordinate systems, i.e. (r, θ, φ) then becomes (ρ, θ, φ). The transformation from spherical co-ordinates (r, θ, φ) [radial, azimuthal, polar] to Cartesian co-ordinates (x, y, z) is given by

\[ x = r\cos\theta\,\sin\varphi \tag{2.9} \]

\[ y = r\sin\theta\,\sin\varphi \tag{2.10} \]

\[ z = r\cos\varphi \tag{2.11} \]

Fig. 2.5 Determination of location of a point on spherical co-ordinate system

Again, the inverse relations are

\[ r = \sqrt{x^2 + y^2 + z^2} \tag{2.12} \]

\[ \theta = \tan^{-1}\left(\frac{y}{x}\right) \tag{2.13} \]

\[ \varphi = \tan^{-1}\left(\frac{\sqrt{x^2 + y^2}}{z}\right) = \cos^{-1}\left(\frac{z}{r}\right) \tag{2.14} \]

2.3 Selection of Scale in Constructing a Graph

While constructing a graph, the scale should be selected carefully, keeping two things in mind: (i) the nature and range of the entire data set and (ii) the size of the graph paper. Conventionally, the independent variable is shown along the x-axis (abscissa) while the dependent variable is shown along the y-axis (ordinate). It is not mandatory to make the scales of the x-axis and y-axis identical. For time series data, the scale of the x-axis starts from the lowest value of the given variable or the starting time of the time series, whereas the scale of the y-axis starts from zero (0). In the case of a frequency distribution, the scale along the y-axis starts from zero while the scale along the x-axis may start from zero or from a value one point before the lowest value of the measured variable (Saksena 1981). After selecting a suitable scale, all the points are plotted on the graph paper and the plotted points are then usually joined by straight lines, although joining them is not mandatory. Although there are several types of graphs, the selection of a suitable graph mainly depends on the type and nature of the data and the objective of the study or research.
In geographical research, the collected, classified, tabulated and summarized data are represented graphically to make them easily understandable and comprehensible. As most graphical representation of geographical data is done by geometrical methods, it is also known as the geometrical representation of data.

2.4 Advantages and Disadvantages of the Use of Graphs

Graphical representation of geographical data has some advantages as well as some disadvantages:

Advantages

1. Graphical representation is more attractive and appealing to the eye, leaves an enduring impression on the mind and is thus easily understandable to all.
2. The trend and tendency of the values of geographical variables (time series data) can be easily understood.
3. It is very effective and useful for understanding the nature and characteristics of complex geographical data sets.
4. It helps in making a comparison of two or more sets of geographical data.
5. The relationship between several sets of geographical variables can be effectively shown by this method.
6. Inaccuracies and errors in geographical data become perceptible in their graphical representation.
7. It is useful for the interpolation of values of geographical variables.
8. Median, mode, quartiles and other descriptive statistics can be easily estimated from the graphical representation of geographical data.

Disadvantages

1. Complete and detailed information about geographical data cannot be obtained from their graphical representation.
2. It reveals only the approximate position; it seldom reflects the exact values.
3. It is time-consuming to prepare a graph.
4. Selection of an inappropriate graph may lead to erroneous conclusions and decisions.
5. A high degree of variability between the values of geographic data may defeat the purpose of graphical representation.
6.
Representation and understanding of a large number of geographic variables become difficult in a single graph.
7. Graphs are sometimes difficult for untrained or illiterate readers and the general public to understand.

2.5 Types of Graphical Representation of Data

Graphical representation of data can be broadly classified under the heads given in Table 2.1.

Table 2.1 Types of graphs

2.5.1 Bi-axial Graphs or Line Graphs or Historigram

The values of two geographical elements or variables are represented along the ‘X’ and ‘Y’ axes of a reference frame. Line graphs are generally drawn to represent time series data like temperature, rainfall, birth rates, death rates, growth of population etc.

The values of different geographical variables change over time. A series of observations recorded in accordance with the time of occurrence is called time series data. The graphical representation of classified and summarized time series data is called a historigram, in which time is considered the independent variable and the corresponding geographical values are taken to be the dependent variable. For the comparison of temporal changes of two or more variables expressed in the same unit of measurement, two or more historigrams are drawn. There are numerous geographical data which can be represented effectively by a historigram. A time series graph or historigram serves two important purposes:

(1) Measurement and analysis of the changes of uni-variate geographical data.
(2) Comparison of changes of two or more geographical variables.

For the construction of a historigram, time (year, month, day etc.) is shown along the ‘X’-axis and the corresponding geographical variable (temperature, rainfall, number of landslide hazards, volume of water discharge in a river, number of population, volume of population migration, amount of agricultural or industrial production etc.)
is shown along the ‘Y’-axis following a suitable scale. Plotting the values of any geographical phenomenon or event with respect to time provides a set of points, which are then joined by a line called a line graph or historigram (Figs. 2.6 and 2.7). For example, the increase of population in the Kolkata Urban Agglomeration (KUA) over time (Table 2.2) can be represented by a historigram. Here, the years are shown on the ‘X’-axis and the total population is shown on the ‘Y’-axis, and the plotted points are then joined by a line (Fig. 2.6). Similarly, the variation of rice production in different years (Table 2.3) can be represented by a historigram. Here, the years are shown on the ‘X’-axis and the amount of rice production is shown on the ‘Y’-axis, and the plotted points are then joined by a line (Fig. 2.7).

Fig. 2.6 Line graph (Historigram) showing the temporal changes of total population in Kolkata Urban Agglomeration (KUA). Source: Census of India

Fig. 2.7 Line graph or Historigram (Production of rice in India, 2000–2011). Source: Directorate of Economics and Statistics (Government of India)

2.5.1.1 Open Line Graph

Simple Line Graph

When the line graph represents the values of only a single variable, element or fact, it is called a simple line graph.

Arithmetic Graph

The use of an arithmetic or linear scale on the horizontal (X-axis) and vertical (Y-axis) axes is the most frequent and common way to represent geographical data with a line graph.
Table 2.2 Data for line graph or historigram (Temporal change of total population in Kolkata Urban Agglomeration)

Year    Total population (in millions)
1901    1.51
1911    1.74
1921    1.88
1931    2.14
1941    3.62
1951    4.67
1961    5.98
1971    7.42
1981    9.19
1991    11.02
2001    13.21
2011    14.03

Table 2.3 Data for line graph or historigram (Production of rice in India, 2000–2011)

Year        Production of rice (million tons)
2000–01     84.98
2001–02     93.34
2002–03     71.82
2003–04     88.53
2004–05     83.13
2005–06     91.79
2006–07     93.36
2007–08     96.69
2008–09     99.18
2009–10 a   89.13
2010–11 b   80.41

a Fourth advance estimates as released on 19.07.2010
b First advance estimates as released on 23.09.2010
Source: Directorate of Economics and Statistics (Government of India)

On an arithmetic scale, equal amounts or values are represented by equal distances (even spaces between numbers), so a series that changes by a constant amount plots as a straight line. Thus, the distance from 1 to 2 is equal to that from 2 to 3, 3 to 4, 4 to 5 and so on, each representing a difference of 1 (Fig. 2.8a). A series that grows or diminishes at a constant percentage rate, by contrast, produces a curving line on an arithmetic or linear scale, ascending at an ever steeper angle for a growing series of values and descending at an ever flatter angle for a diminishing series of values. The major advantages and disadvantages of using arithmetic graphs are:

Fig. 2.8 a Arithmetic scale on both the axes, b Arithmetic scale on the ‘X’-axis but logarithmic scale on the ‘Y’-axis, c Arithmetic scale on the ‘Y’-axis but logarithmic scale on the ‘X’-axis, and d Logarithmic scale on both the axes

Advantages

(1) Presentation of geographical data with a line graph on arithmetic scales is very easy and simple because basic mathematical principles are applied.
(2) Arithmetic line graphs are very easy to read and understand. Most readers expect a level twice as high to be twice as large.
(3) Zeroes or negative values can be easily represented on arithmetic scales.
(4) These graphs are useful for the representation and understanding of the absolute changes of values of geographical variables.

Disadvantages

(1) These graphs show only the absolute changes of values; they do not show the relative changes. Thus, they are not useful for comparing the relative (percentage) changes of values of geographical variables.

Logarithmic Graph

Representation of geographical data with a line graph on a logarithmic scale (equal spacing between powers of 10) is an alternative and more useful technique for comparing the rate of change of values. These graphs are more useful and effective for understanding and comparing the relative (percentage) changes of a set of values than their absolute amounts of change. A log-graph is commonly used when the range of values of the variable is very large and the values increase or decrease at a roughly constant ratio.

On a logarithmic scale, equal distances stand for equal ratios. For instance, the distance from 1 to 2 is equal to that from 2 to 4 (2/4 = 1/2), 4 to 8 (4/8 = 1/2), 8 to 16 (8/16 = 1/2) and so on, each interval being in the ratio 1:2 (vertical axis of Fig. 2.8b, horizontal axis of Fig. 2.8c and both axes of Fig. 2.8d). Representation of data on a logarithmic scale clearly depicts the percentage increase or decrease between two data values.

On log-graph paper, the axes (either the X-axis or the Y-axis or both) are divided into several parts of equal length, known as cycles. A single cycle corresponds to a tenfold increase in the values of the variable and, similarly, two cycles indicate a 100-fold increase.
The value at the top of the first cycle is ten times the value at its bottom, and the value at the top of the second cycle is ten times the value at the bottom of the second cycle (the top of the first cycle), i.e. a hundred times the value at the bottom of the first cycle (vertical axis of Fig. 2.8b, horizontal axis of Fig. 2.8c and both axes of Fig. 2.8d). This follows from the principle that a common logarithm is the power to which 10 must be raised to produce a specified number. Thus, 100 = 10² and 1000 = 10³, so the logarithms of 100 and 1000 are 2 and 3, respectively, and so on. A log scale can never start with zero or a negative value, because log (0) is undefined (it tends to −∞) and negative numbers have no real logarithm. So, a suitable positive value should be taken at the origin by the user. Based on the selection of the logarithmic scale (either on the X-axis or the Y-axis or both the axes), log-graphs are of two types.

Semi-logarithmic Graph

A semi-logarithmic or semi-log line graph has one axis on a logarithmic scale (equal spacing between powers of 10) and the other axis on an arithmetic or linear scale (even spaces between numbers) (Fig. 2.8b, c). These graphs are useful for data with exponential relationships, or where a single variable covers a large range of values. A set of geographical data with an exponential trend, plotted using a logarithmic scale on the y-axis, will resemble a straight line, slanting up or down based on the nature of the data values. When the values increase or decrease at a constant percentage rate, the plot is exactly a straight line.
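The two properties of the logarithmic scale described above (equal ratios occupy equal distances, and a series changing at a constant percentage rate plots as a straight line) can be verified with a few lines of Python. This is an illustrative sketch only: the function name, the 5 cm cycle length and the 50% growth series are assumptions made for the example, not values from the text.

```python
import math

def log_axis_position(value, cycle_length_cm=5.0):
    """Distance (cm) of `value` from the mark for 1 on a log10 axis.

    One cycle (a tenfold increase) occupies `cycle_length_cm`, an assumed length.
    """
    if value <= 0:
        raise ValueError("a logarithmic scale cannot show zero or negative values")
    return cycle_length_cm * math.log10(value)

# Equal ratios occupy equal distances: 1->2, 2->4, 4->8 and 8->16 are equally long.
gaps = [log_axis_position(b) - log_axis_position(a)
        for a, b in [(1, 2), (2, 4), (4, 8), (8, 16)]]
print([round(g, 3) for g in gaps])

# One cycle is a tenfold increase, two cycles a 100-fold increase.
print(log_axis_position(10), log_axis_position(100))

# A hypothetical series growing by 50% per period.
series = [100 * 1.5 ** t for t in range(6)]
arith_steps = [b - a for a, b in zip(series, series[1:])]
log_steps = [log_axis_position(b) - log_axis_position(a)
             for a, b in zip(series, series[1:])]
print(arith_steps)                         # steps keep growing: a curve on an arithmetic scale
print([round(s, 3) for s in log_steps])    # steps are equal: a straight line on a log scale
```

The `ValueError` branch mirrors the restriction stated above: zero and negative values have no position on a logarithmic axis.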
Log–Log Graph

In a log–log graph, both the X (horizontal) and Y (vertical) axes are represented using the logarithmic scale, in which equal distances measure equal ratios (Fig. 2.8d).

The two line graphs given here (Figs. 2.9 and 2.10) show the difference between the two scales when representing the same data, i.e. the age-specific number of deaths of the male and female population (Table 2.4).

Table 2.4 Database for arithmetic and logarithmic line graph (Age and sex-specific variation of death rates)

Age group    Number of deaths (per year)
             Male     Female
<15          15       20
15–19        17       20
19–24        23       24
25–29        27       45
30–34        33       105
35–39        60       210
40–44        110      318
45–49        235      480
50–54        470      625
55–59        820      820
60–64        1340     1205
65–69        2110     1508
70–74        2905     1750
75–79        3380     1820
80–84        3385     2010
84+          2000     1325

The first line graph (Fig. 2.9) has been drawn using the arithmetic scale on both axes. The graph suggests that female death rates are only slightly higher than male death rates until about the age group 50–54. From the age group 55–59 onwards, the male death rate catches up with and then exceeds the female death rate, and from the age group 65–69 it becomes much higher and stays much higher. The second line graph (Fig. 2.10), however, has been drawn using a logarithmic scale on the ‘y’-axis. In this graph, the female death rates for the younger age groups appear noticeably higher than the male death rates, and the relative (percentage) differences in the death rates for the older age groups are not as large as they appear in the arithmetic line graph. So, the logarithmic graph highlights a possibly significant difference between the death rates of the male and female population in the younger age groups, whereas this difference is lost in the arithmetic line graph because the scale is dominated by the higher absolute values for the older age groups.

Fig. 2.9 Arithmetic graph (Number of male and female deaths per year)

Fig. 2.10 Logarithmic graph (Number of male and female deaths per year)

Advantages and Disadvantages of Using Logarithmic Graph

Advantages

(1) Logarithmic line graphs show the relative changes, i.e. these graphs are useful for identifying and understanding the relative (percentage) changes of values of geographical variables.
(2) In comparative studies, logarithmic line graphs provide a more complete depiction and explanation of the relationship that exists between sets of data.

Disadvantages

(1) Zeroes or negative values cannot be represented on logarithmic line graphs. Also, these graphs are not as easy to construct as arithmetic graphs.
(2) Since most readers and users expect a level twice as high to be twice as large, these graphs may mislead them because they do not understand what type of comparison is shown.
(3) These graphs look very technical and discourage readers and users from trying to read and interpret them.

Because of these problems, although logarithmic line graphs provide a more complete depiction and explanation of the relationship between series of data, they are not widely recommended or used to represent geographical data. They are not suitable for displaying geographical data in fields and reports where the general public is the target audience. They are excellent for specialized purposes and for audiences with adequate technical knowledge. In comparison, arithmetic line graphs are commonly and frequently used for the representation of geographical data because they are simple and easy to understand.
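The contrast between absolute and relative differences drawn above from Figs. 2.9 and 2.10 can be reproduced directly from the figures in Table 2.4. The short Python sketch below is an illustration rather than the author's procedure; the variable names are hypothetical and the age-group labels are copied as printed in the table.

```python
# Number of deaths per year by age group, as given in Table 2.4.
age_groups = ["<15", "15–19", "19–24", "25–29", "30–34", "35–39", "40–44", "45–49",
              "50–54", "55–59", "60–64", "65–69", "70–74", "75–79", "80–84", "84+"]
male = [15, 17, 23, 27, 33, 60, 110, 235, 470, 820, 1340, 2110, 2905, 3380, 3385, 2000]
female = [20, 20, 24, 45, 105, 210, 318, 480, 625, 820, 1205, 1508, 1750, 1820, 2010, 1325]

# Each row: (age group, male, female, absolute difference, female-to-male ratio).
rows = [(g, m, f, f - m, f / m) for g, m, f in zip(age_groups, male, female)]

# What the arithmetic graph emphasises: the largest gap in absolute numbers.
largest_absolute = max(rows, key=lambda row: abs(row[3]))

# What the logarithmic graph emphasises: the largest gap as a ratio.
largest_relative = max(rows, key=lambda row: row[4])

for group, m, f, diff, ratio in rows:
    print(f"{group:>6}  male={m:5d}  female={f:5d}  diff={diff:6d}  ratio={ratio:5.2f}")

print("largest absolute gap:", largest_absolute[0], largest_absolute[3])
print("largest relative gap:", largest_relative[0], round(largest_relative[4], 2))
```

On these data, the largest absolute gap (1560 deaths) falls in the age group 75–79, while the largest ratio (female deaths 3.5 times the male figure) falls in the age group 35–39, which is the difference the logarithmic graph brings out and the arithmetic graph hides.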
Difference Between Arithmetic (Linear) and Logarithmic Line Graphs

The major differences between arithmetic and logarithmic line graphs are:

1. Arithmetic line graph: An arithmetic or linear scale is used on both axes, i.e. on the X-axis (horizontal) and the Y-axis (vertical), to represent the data.
   Logarithmic line graph: A logarithmic scale is used on the X-axis (horizontal), the Y-axis (vertical) or both axes to represent the data.
2. Arithmetic line graph: It can be used to represent any type of geographical data.
   Logarithmic line graph: It is commonly used when the range of values of the data set is very large.
3. Arithmetic line graph: Equal distances represent equal amounts or values, i.e. the scale shows increases or decreases by a constant amount.
   Logarithmic line graph: Equal distances represent equal ratios, i.e. the scale shows increases or decreases at a constant ratio.
4. Arithmetic line graph: Zeroes or negative values can be easily represented.
   Logarithmic line graph: Zeroes or negative values cannot be represented.
5. Arithmetic line graph: A series changing at a constant percentage rate produces a curving line, ascending at an ever steeper angle for a growing series of values and descending at an ever flatter angle for a diminishing series of values.
   Logarithmic line graph: A set of data with an exponential trend, plotted using a logarithmic scale on the y-axis, resembles a straight line, slanting up or down based on the nature of the data values. When the values increase or decrease at a constant percentage rate, the plot is a straight line.
6. Arithmetic line graph: These graphs are very easy for the general public to read and understand because basic mathematical principles are applied.
   Logarithmic line graph: These graphs look very technical and are difficult for the general public to understand. They are excellent only for specialized audiences with adequate technical knowledge.
7. Arithmetic line graph: These graphs show only the absolute changes of values, not the relative changes. Thus, they are not useful for comparing the relative (percentage) changes of values of geographical variables.
   Logarithmic line graph: These graphs provide a more complete depiction and explanation of the relationship between sets of data. They are useful for identifying and understanding the relative (percentage) changes of values.

Composite or Compound Line Graph

When a line graph shows the relationship between two or more variables, elements or facts, it is called a composite or compound graph.

Poly Graph

A poly graph is a multiple line graph in which two or more sets of variables are represented by distinctive lines (Fig. 2.11). It is frequently used for immediate comparison between several sets of variables, for instance, the death rates and birth rates of different states in a country; male and female literacy rates in different census years in a country (Table 2.5); the proportion (%) of child, adult and old population in different census years in a country; the amount of production of different crops (rice, wheat, maize, pulses etc.) in different years in a region etc. Generally, different variables are represented by different line patterns like a solid line (___), dotted line (……), broken line (- - -) or lines of various colours (Fig. 2.11).

Fig. 2.11 Poly graph showing total, male and female literacy rates

Table 2.5 Worksheet for poly graph (Total, male and female literacy rates in different census years in India)
Scale selected: 1 cm to 10% literacy rate

Census year    Literacy rate (%)            Literacy rate according to scale (cm)
               Total    Male     Female     Total    Male    Female
1901           5.35     9.83     0.60       0.53     0.98    0.06
1911           5.92     10.56    1.05       0.59     1.06    0.1
1921           7.16     12.21    1.81       0.72     1.22    0.18
1931           9.5      15.59    2.93       0.95     1.56    0.29
1941           16.1     24.9     7.30       1.6      2.5     0.73
1951           16.67    24.95    7.93       1.7      2.5     0.79
1961           24.02    34.44    12.95      2.4      3.4     1.29
1971           29.45    39.45    18.69      2.9      3.9     1.87
1981           36.23    46.89    24.82      3.6      4.7     2.48
1991           42.84    52.74    32.17      4.3      5.3     3.2
2001           54.51    63.23    45.15      5.4      6.3     4.51
2011           64.32    71.22    56.99      6.4      7.1     5.7

Source: Census of India

Band Graph

A band graph is essentially an aggregate line graph that shows the trends of values (in percentages, numbers or quantities) over successive time periods for both the total and its component parts (Table 2.6) by a series of lines drawn on the same frame (Fig. 2.12). A band graph shows how and in what proportion the component items constituting the aggregate are distributed. The component items are represented one above the other and the intervening gaps between the successive lines are filled with different colours or shades so that the graph looks like a series of bands (Fig. 2.12). When the differences in values between the component parts are small, the band graph represents the trends of their distribution impressively, but when the variations are too large the band graph becomes less impressive and its legibility and clarity are marred. In geographical study, the band graph is useful for different purposes, including dividing total crop production into different crops, total cost into component costs, total production by type of commodity or industry and other such relationships.

Table 2.6 Worksheet for band graph (Production of different crops in India)
Scale selected: 1 cm to 100 million tons

Year                      Production (million tons)                       Total            Production according to scale (cm)
                          Rice    Wheat   Cereals  Pulses  Food grains    (million tons)   Rice   Wheat  Cereals  Pulses  Food grains
2004–05                   83.1    68.6    185.2    13.1    198.4          548.4            0.83   0.69   1.85     0.13    1.98
2010–11                   96.0    86.9    226.3    18.2    244.5          671.9            0.96   0.87   2.26     0.18    2.44
2011–12                   105.3   94.9    242.2    17.1    259.3          718.8            1.05   0.95   2.42     0.17    2.59
2012–13                   105.2   93.5    238.8    18.3    257.1          712.9            1.05   0.93   2.39     0.18    2.57
2013–14                   106.7   95.9    245.8    19.3    265.0          732.7            1.07   0.96   2.46     0.19    2.65
2014–15 (4th Adv Est.)    104.8   88.9    235.5    17.2    252.7          699.1            1.05   0.89   2.35     0.17    2.52

Source: Directorate of Economics and Statistics, Ministry of Agriculture and Farmers Welfare

Fig. 2.12 Band graph showing the production of various crops in different years in India. Source: Directorate of Economics and Statistics, Ministry of Agriculture and Farmers Welfare

2.5.1.2 Closed Line Graph

Climograph

The concept of the climograph was first conceived by J. Ball in the form of ‘Climatological Diagrams’ (Singh and Singh 1991), and it was introduced by Griffith Taylor in the first half of the twentieth century (1949). Koppen used this graph to summarize the variations of world climatic conditions, and J.B. Leighly elaborated Koppen’s idea to compare the climatic conditions of different parts of the world. Two other important types of climograph were designed by the United States Department of Agriculture (U.S.D.A., 1941) and E.E. Foster (1944). The climograph was originally devised to show the scale of habitability for white settlers within the tropics.

Climograph of USDA Type (1941)

The United States Department of Agriculture devised a type of climograph in 1941 in which mean monthly wet-bulb temperatures (°F) are plotted against mean monthly dry-bulb temperatures (°F) on a reference frame. Twelve points, one for each month, are obtained on the graph paper and joining these points results in a closed 12-sided polygon called a climograph.

Fig. 2.13 USDA type of climograph
Generally, this type of climograph is drawn to explain the climatic conditions with respect to human physiological comfort (Fig. 2.13).

Climograph of Foster Type (1944)

In 1944, E.E. Foster devised a type of climograph in which mean monthly temperatures (°F) are plotted against monthly precipitation (inches) on a reference frame. Monthly precipitation (rainfall) is shown along the ‘X’-axis (abscissa), graduated from 0 to 18 inches, and the mean monthly temperature is shown along the ‘Y’-axis (ordinate), graduated from −20 to 100 °F (Fig. 2.14).

Fig. 2.14 The base frame of Foster’s climograph

The reference frame is divided into six temperature zones from bottom to top, namely frigid zone (−20–0 °F), cold zone (0–32 °F), cool zone (32–50 °F), mild zone (50–65 °F), warm zone (65–80 °F) and hot zone (more than 80 °F). Additionally, the top four zones are divided into five sub-zones based on the amount of precipitation, namely arid zone (0.32–1.03 inch), semi-arid zone (0.59–1.93 inch), sub-humid zone (1.10–3.60 inch), humid zone (2.05–6.73 inch) and wet zone (more than 2.05 inch and more than 6.73 inch) (Fig. 2.14). Each month is depicted by a letter symbol and joining these points results in a closed 12-sided polygon called a climograph. This type of climograph is primarily used to depict the climatic classification system proposed by C.W. Thornthwaite.

Climograph of G. Taylor (1949)

According to G. Taylor, a climograph is a 12-sided polygon obtained from the graphical representation of the wet-bulb temperature (°F) and the relative humidity (%) of the 12 months for a particular place or station (Fig. 2.15 and Table 2.7). The relative humidity is shown along the ‘X’-axis (abscissa), graduated from 20 to 100%, and the wet-bulb temperature is shown along the ‘Y’-axis (ordinate), graduated from −10 to 90 °F.
The 12 points (one for each month) are obtained on the graph paper by plotting wet-bulb temperature against relative humidity for the 12 months, and joining these points results in a 12-sided polygon called a climograph.

Fig. 2.15 Climograph showing the wet-bulb temperature and relative humidity of Kolkata (after G. Taylor)

Table 2.7 Monthly wet-bulb temperature (°F) and relative humidity (%) of Kolkata, West Bengal

Months                       Jan    Feb    Mar    Apr    May    Jun    Jul    Aug    Sep    Oct    Nov    Dec
Wet-bulb temperature (°F)    64.7   68.5   70.5   79.8   82.9   82.4   80.8   82.6   80.4   78.1   68.9   68.5
Relative humidity (%)        41     45     40     39     59     67     83     80     76     70     46     49

He marked four corners of the frame, designated Keen (SW), Raw (SE), Muggy (NE) and Scorching (NW) (Fig. 2.15).

(a) Keen: Low wet-bulb temperature (below 40 °F) and low relative humidity (below 40%).
(b) Raw: Low wet-bulb temperature (below 40 °F) and high relative humidity (above 70%).
(c) Muggy: High wet-bulb temperature (above 60 °F) and high relative humidity (above 70%).
(d) Scorching: High wet-bulb temperature (above 60 °F) and low relative humidity (below 40%).

He also designed a tentative scale of discomfort and identified six categories:

(1) Very rarely uncomfortable: below 45 °F (40–45 °F)
(2) Ideal: 45–55 °F
(3) Rarely uncomfortable: 55–60 °F
(4) Sometimes uncomfortable: 60–65 °F
(5) Often uncomfortable: 65–70 °F
(6) Usually uncomfortable: above 70 °F (70–75 °F).

Table 2.8 Mean monthly temperature and rainfall of Burdwan district, West Bengal

Months                   Jan    Feb    Mar    Apr    May    Jun    Jul    Aug     Sep    Oct    Nov    Dec
Mean temperature (°C)    19.3   21.7   26.1   29.7   31.7   30.2   29.6   29.1    28.4   28.2   24.1   18.6
Rainfall (cm)            0.54   2.12   8.54   8.93   9.26   24.1   23.8   36.53   23.7   6.74   2.52   0

A shift of the graph towards the corners indicates the uncomfortable characteristics of the climate.
‘Scorching’ and ‘Keen’ zones represent the hot desert and cold climatic characteristics, respectively, whereas the ‘Muggy’ region indicates the tropical humid climate (Fig. 2.15). The shape of the climograph is useful for understanding the climatic character of a place or station. An unknown climate can be compared simply and easily with reference to the shape of the typical climographs (Singh and Singh 1991).

• Spindle-shaped climograph indicates the dry continental climate.
• Diagonal climograph represents the Mediterranean type of climate.
• Diagonal elongated climograph indicates the monsoon type of climate.
• Full-bodied climograph represents the ideal British type of climate.

Hythergraph

The hythergraph, a special form of climograph, was devised by Taylor (1949) to show the relationship between mean monthly temperature and mean monthly rainfall. It is drawn in the same way as the climograph. Mean monthly rainfall is represented along the ‘X’-axis (abscissa) and mean monthly temperature is shown along the ‘Y’-axis (ordinate). The 12 points (one for each month) are obtained on the graph paper by plotting mean monthly temperature against mean monthly rainfall for the 12 months, and joining these points results in a 12-sided polygon called a hythergraph. Table 2.8 shows the mean monthly temperature and rainfall of Burdwan district, West Bengal, and these data are represented graphically using a hythergraph (Fig. 2.16).

Fig. 2.16 Hythergraph showing the mean monthly temperature and rainfall of Burdwan district

Significance of Hythergraph

1. The hythergraph is principally used for comparing the climatic characteristics of different regions as they affect the cultivation of various crops, like rice, wheat, pulses, cotton etc.
2. It summarizes the basic climatic differences with respect to human activity, specifically in the context of settlement.
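The construction of the closed graphs described in this section can be sketched in a few lines of Python, using the data of Tables 2.7 and 2.8. The sketch is an illustration under stated assumptions, not Taylor's own procedure: the function names are hypothetical, and the corner limits quoted above are treated as strict inequalities, so a month lying exactly on a limit (for example a relative humidity of exactly 70%) is assigned to no corner.

```python
MONTHS = ["Jan", "Feb", "Mar", "Apr", "May", "Jun",
          "Jul", "Aug", "Sep", "Oct", "Nov", "Dec"]

# Table 2.7: Kolkata, wet-bulb temperature (deg F) and relative humidity (%).
wet_bulb_f = [64.7, 68.5, 70.5, 79.8, 82.9, 82.4, 80.8, 82.6, 80.4, 78.1, 68.9, 68.5]
rel_humidity = [41, 45, 40, 39, 59, 67, 83, 80, 76, 70, 46, 49]

# Table 2.8: Burdwan district, mean temperature (deg C) and rainfall (cm).
mean_temp_c = [19.3, 21.7, 26.1, 29.7, 31.7, 30.2, 29.6, 29.1, 28.4, 28.2, 24.1, 18.6]
rainfall_cm = [0.54, 2.12, 8.54, 8.93, 9.26, 24.1, 23.8, 36.53, 23.7, 6.74, 2.52, 0]

def taylor_corner(wet_bulb, humidity):
    """Return the name of Taylor's corner for one month, or None if it lies in no corner."""
    if wet_bulb < 40 and humidity < 40:
        return "Keen"
    if wet_bulb < 40 and humidity > 70:
        return "Raw"
    if wet_bulb > 60 and humidity > 70:
        return "Muggy"
    if wet_bulb > 60 and humidity < 40:
        return "Scorching"
    return None

def closed_polygon(xs, ys):
    """Pair the x and y values month by month and repeat the first point to close the figure."""
    points = list(zip(xs, ys))
    return points + points[:1]

# Climograph of Kolkata: humidity on the X-axis, wet-bulb temperature on the Y-axis.
climograph = closed_polygon(rel_humidity, wet_bulb_f)
corners = {m: taylor_corner(t, h) for m, t, h in zip(MONTHS, wet_bulb_f, rel_humidity)}
print({m: c for m, c in corners.items() if c})

# Hythergraph of Burdwan: rainfall on the X-axis, temperature on the Y-axis.
hythergraph = closed_polygon(rainfall_cm, mean_temp_c)
print(len(hythergraph), hythergraph[0], hythergraph[-1])
```

With these limits, July, August and September of Kolkata fall in the ‘Muggy’ corner and April in the ‘Scorching’ corner. Each polygon list holds 13 points because the January point is repeated at the end, which is what turns the 12 monthly points into a closed 12-sided figure.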
2.5.2 Tri-axial Graphs

Values of three geographical elements are represented along the sets of ‘X’, ‘Y’ and ‘Z’ axes on a reference frame. These graphs are very useful for representing three inter-related variables.

2.5.2.1 Ternary Graph

A ternary graph is an equilateral triangular graph that displays the proportions of three inter-related components or variables that sum to a constant (100%). The three components or variables are shown along the three sides of the triangle, respectively. Each side of the triangle is graduated from 0 to 100% to represent tri-component data. The vertices of the triangle are given by (1, 0, 0), (0, 1, 0) and (0, 0, 1), i.e. each apex represents 0% on two scales and 100% on the third (Fig. 2.17).

Fig. 2.17 Identification of position of points in ternary graph

Techniques and Principles of Representation of Data in Ternary Graph

A ternary graph becomes very useful whenever data composed of three inter-related components or variables can be converted into percentages totalling 100. In this graph, data are represented by using barycentric co-ordinates, i.e. triples of numbers (x1, x2, x3) corresponding to amounts placed at the vertices of a reference triangle (say △ABC). These amounts then determine the location of a point ‘P’, which is the geometric centroid (centre of mass) of the three amounts and is identified with the co-ordinates (x1, x2, x3) (Fig. 2.17).

Table 2.9 Database for ternary graph (Proportion of sand–silt–clay in sediments)

Stations      Proportion of sediment particles (%)
              Sand    Silt    Clay
Kolaghat      68      25      7
Soyadighi     65      17      18
Anantapur     52      30      18
Pyratungi     60      12      28
Dhanipur      45      35      20
Geonkhali     54      16      30

Since the values of the three components add up to 100%, the three values of each observation are plotted on the graph as a single point. The position of points within the triangular graph reflects the relative dominance of each component (Fig.
2.17). Example Data like age composition (young, adult and old), textural composition of soil or sedi- ment (sand, silt and clay), occupational structure of population (primary, secondary and tertiary) etc. are suitable for representation in ternary graph. Table 2.9 shows the proportion of sand, silt and clay in the sediment samples collected from the bed of Rupnarayan River at different sites (Kolaghat, Soyadighi, Anantapur, Pyratungi, Dhanipur and Geonkhali) and it is represented graphically in a ternary graph (Fig. 2.18). Types of sediment samples are easily understood by observing the location of points in the ternary graph. A ternary graph was, however, found the most appropriate technique for the classification of a large number of Indian towns. Asok Mitra (1964) for the first time used the ternary graph for functional classification of towns in the 1961 census. His method of classification is based on the concept of dominant functions of a city. The seven census categories of workers were grouped into three broad non- agricultural categories, namely (1) industry, (2) trade and transport and (3) services. The percentages of three categories of towns are then plotted on a ternary graph and their position in the triangle was taken as the main determinant of their functional classification. 76 2 Representation of Geographical Data Using Graphs Fig. 2.18 Identification of sediment type using ternary graph 2.5.3 Multi-axial Graphs The reference frame is composed of a network of evenly spaced linesradiating from the centre. The radiating lines are drawn at true azimuth on vector graph. Values are plotted along the radiating lines and the obtained points are then joined. 
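The plotting rule just described (evenly spaced radiating lines, one value per line, points joined in order) can be sketched in a few lines of Python. This is only an illustrative sketch, not a procedure given in the text: the function name `radar_vertices`, the choice of starting the first line at the top and proceeding clockwise, and the outer scale limit of 1500 are all assumptions. The values are the grain production figures of Table 2.10.

```python
import math

def radar_vertices(values, max_value=None):
    """Return the (x, y) vertices of a multi-axial (radar) polygon.

    The radiating lines are equi-angular; the first one points straight
    up and the rest follow clockwise (an assumed convention). Each value
    is divided by max_value so the outer end of a line has length 1.
    """
    if max_value is None:
        max_value = max(values)
    n = len(values)
    step = 2 * math.pi / n          # angle between neighbouring lines
    vertices = []
    for i, v in enumerate(values):
        r = v / max_value           # distance from the centre
        theta = i * step            # clockwise angle from the vertical
        vertices.append((r * math.sin(theta), r * math.cos(theta)))
    return vertices

# Grain production (tonne), 2007-2016, from Table 2.10
grains = [850, 1284, 1170, 974, 554, 1491, 820, 584, 1319, 1270]

# 1500 is a hypothetical outer limit of the scale, chosen for illustration
polygon = radar_vertices(grains, max_value=1500)
```

Joining the vertices in order, and closing the polygon back to the first vertex, gives the outline that a radar graph would show for this series.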
2.5.3.1 Radar or Spider or Star Graph

A radar graph, also called a spider graph or star graph, is the graphical representation of multivariate data (three or more variables) in the form of a two-dimensional graph consisting of a sequence of equi-angular spokes or axes (radii) starting from the same point, each representing a different variable. The length of each spoke is proportional to the quantity of the variable for the data point relative to the maximum quantity of that variable across all data points. The points plotted on the spokes are then joined with a line of a selected colour or pattern, often as a shaded polygon, to represent each category, creating a star-like shape with as many points as there are categories (Fig. 2.19).

Fig. 2.19 Radar graph (Production of different crops)

A radar graph offers numerous visual comparisons by portraying multivariate data with different variables. In other words, if we want to understand how multiple data points interact with each other and compare them against another set of multiple data points, a radar graph is one of the best ways to do so. This graph is mainly drawn to display the continuous diurnal, monthly or annual rhythm of different geographic variables. For example, hourly data of atmospheric humidity, atmospheric temperature, sunshine and soil temperature; monthly data of atmospheric humidity, rainfall and atmospheric temperature; and yearly data of production of different crops, industrial goods etc. can be easily represented in a radar graph.

Methods of Construction

1. A suitable scale is first selected to represent the data, and the required number of concentric circles is drawn at regular intervals (Fig. 2.19).
2. The required number of equi-angular radial straight lines (axes) of corresponding lengths is drawn from the centre (i.e. the point of origin). Each axis shares the same divisions and scale, but the way in which the range of each variable's values is mapped onto this scale may differ between the represented variables.
3. The values from a single observation are plotted along each axis and joined to form a polygon, which makes the graph more easily readable and understandable (Fig. 2.19).
4. Further observations can be placed in the same graph with the help of multiple polygons.
5. These polygons are overlaid and the opacity of every polygon is adjusted for each of the observations.

Hence, an hourly graph is represented by a 24-sided polygon and a monthly graph by a 12-sided polygon.

Table 2.10 Data for radar graph (Production of different crops in different years)

Production of different crops (tonne)
Year   Grains   Pulses   Vegetables   Cereals
2007   850      1143     885          847
2008   1284     691      1295         980
2009   1170     680      1409         1272
2010   974      1182     599          1310
2011   554      1405     972          1354
2012   1491     453      937          568
2013   820      572      1085         1004
2014   584      439      802          885
2015   1319     762      974          726
2016   1270     1277     1214         815

Steps of Drawing Radar Graph in Microsoft Excel

Step 1: Insert the data in a suitable format.
Step 2: Go to Insert tab → Other Charts → Select Radar with Marker Chart. A blank radar graph will be inserted.
Step 3: Right click on the blank graph and click on Select Data.
Step 4: Click on the Add button.
Step 5: Select Series name as Grains (for the following example) and Series value as the production values (Table 2.10) and click Ok.
Step 6: Repeat the same procedure for all the remaining data. After this, click on Ok and a graph will be inserted (Fig. 2.19).
Step 7: Format the graph according to your need.

How to Understand the Radar Graph

Like the column or bar graph, the spider graph also has ‘X’ and ‘Y’ axes. The X-axis is formed by the outer ends of the radial lines (spokes), and the graduated steps along each radial line act as the Y-axis. The zero (0) point of the radar graph lies at the centre of the wheel. The closer a point lies to the outer end of a spoke, the higher is its value.

Interpretation of the Graph

• By having a look at the spider graph (Fig. 2.19) we can understand that in 2012 the production of grains was the highest of all the 10 years. In 2011, the production of grains was the lowest.
• In the case of pulses, the highest production was in 2011 and the lowest production was in 2014.
• In the case of vegetables, the highest production was in 2009 and the lowest production was in 2010.
• In the case of cereals, the highest production was in 2011 and the lowest production was in 2012.

Advantages of Using Radar Graph

Radar or spider graphs are frequently used in geographical analysis to compare frequency and index data distributed along radial lines of different directions, and so to compare two or more areas. The major advantages of using this graph are as follows:

1. The radar graph is more suitable when the absolute values are not critical for a user but the graph as a whole tells a story.
2. Radar graphs are very useful for strikingly showing outliers and commonality, or for showing when one graph is larger in every variable than another.
3. Several attributes become easily comparable, each along its own axis, and their overall variations are clear from the shape and size of the drawn polygons.
4. In a radar graph, many variables can be easily shown next to each other whilst still giving each variable the identical resolution.
5. Radar graphs are more effective when there is a need to compare the performance of one thing to a standard or to a group's performance.
For example, if one has a radar graph portraying data about the average quantity of production of crops in different regions, one could easily superimpose another polygon representing the production data of a particular crop in order to quickly observe how that crop compares to average crop production in each region (Fig. 2.19).

Limitations

The major disadvantages of this graph are:

(1) The comparison of data on a radar graph becomes difficult once there are more than two webs on the graph.
(2) When too many variables are represented along different axes, the data become crowded.
(3) Though the axes have gridlines joining them for reference, problems arise when observers try to compare values along different axes.
(4) Each axis of a radar graph shares a common scale, which means that the range of values of each variable has to be mapped onto this shared scale in a different way. The way these variables are mapped is not evident in most cases, and can even be misleading.
(5) Observers may think that the area of the polygons is a very important thing to consider. However, the shape and area of the polygons can vary largely depending on how the axes are arranged around the circle.

Alternatives to radar graphs are bar graphs and parallel coordinate graphs.

2.5.3.2 Polar or Rose Graphs

A polar or rose graph is the graphical representation of the direction as well as the magnitude or quantity of different phenomena or variables, especially those that are geographical in nature. Phenomena or variables characterized by direction and distance from a specific point of origin can be plotted on polar graphs (Figs. 2.20 and 2.21). They are analogous to radar graphs, but the radiating axes stand for directions rather than for arbitrary variables, so they are used specifically for directional geographical phenomena.

Fig. 2.20 Wind rose graph showing the percentage of days wind blowing from different directions

Fig. 2.21 Polar graph showing the number of corries facing towards different directions

Principles and Methods of Construction

In a polar or rose graph, values are plotted as radii from a point of origin (pole) with the help of a polar coordinate system. This graph draws the ‘X’ and ‘Y’ co-ordinates in each series as (theta [θ], r) (discussed in the section on types of co-ordinates), where theta is the amount of rotation from the origin (vector angle) and ‘r’ is the distance from the origin (radius vector). The outer values around the circle always represent degrees. Data ‘X’ holds the x-axis position in degrees and data ‘Y’ holds the position of each phenomenon of the variable on the y-axis (Figs. 2.20 and 2.21). These graphs are especially useful where vector values are involved.

Polar or rose graphs are useful for the analysis of various geographical data containing magnitude and direction values. These graphs are generally used to represent the direction, magnitude and frequency of ocean or wind waves, the direction of facing of cirques or corries, the orientation of the long axes of pebbles or boulders etc.

The wind rose graph is the most frequently used polar graph in geographical analysis. Meteorologists, climatologists and geographers use the wind rose to graphically display wind speed and wind direction at a particular location over a defined observation period (Table 2.11 and Fig. 2.20). A wind rose can be prepared month-wise, season-wise or yearly as required. It typically uses 16 compass directions, such as north (N), north-east (NE), south (S), east (E) etc., although these may be sub-divided into as many as 32 directions. In terms of the measurement of angle in degrees, north corresponds to 0°/360°, east to 90°, south to 180° and west to 270°.
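As a minimal sketch of the (θ, r) plotting rule, the snippet below converts the eight directions of Table 2.11 into plotting positions. It is an illustration only: the dictionary and function names are hypothetical, and the only convention assumed is the one stated above (bearings measured clockwise from north, with north at 0°).

```python
import math

# Bearings in degrees, measured clockwise from north (north = 0/360)
BEARINGS = {"N": 0, "NE": 45, "E": 90, "SE": 135,
            "S": 180, "SW": 225, "W": 270, "NW": 315}

# Percentage of days with wind from each direction (Table 2.11).
# The 6% of calm days has no direction, so it is not drawn as a spoke.
wind = {"N": 29, "NE": 7, "E": 10, "SE": 12,
        "S": 10, "SW": 9, "W": 4, "NW": 13}
CALM = 6

def spoke_endpoint(bearing_deg, r):
    """Convert a compass bearing and a radius to plotting (x, y).

    Compass bearings run clockwise from north, so x uses the sine and
    y the cosine (the reverse of the usual mathematical convention).
    """
    theta = math.radians(bearing_deg)
    return (r * math.sin(theta), r * math.cos(theta))

endpoints = {d: spoke_endpoint(BEARINGS[d], pct) for d, pct in wind.items()}

# Sanity check on the table: directional days plus calm days
total_with_calm = sum(wind.values()) + CALM
```

Each endpoint is the tip of one spoke of the rose; calm conditions are usually reported separately (for example in the centre of the rose) because they carry no direction.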
Figure 2.20 shows the percentage of days on which wind blows from different directions. It is clear from the graph that calm conditions prevail on only 6% of days. Wind blows from the north on about 29% of days, followed by 13% from the north-west, 12% from the south-east, 10% each from the east and the south, 9% from the south-west etc. So, there is considerable variability in the direction of the wind.

Table 2.11 Percentage of days wind blowing from different directions

Wind direction   Percentage of days wind blowing from this direction
North            29
North-east       7
East             10
South-east       12
South            10
South-west       9
West             4
North-west       13
Calm             6

Table 2.12 Data for polar graph (The orientation of corries in a glacial region)

Orientation of corries   Number of corries facing towards this direction
North                    30
North-east               60
East                     40
South-east               50
South                    0
South-west               10
West                     0
North-west               20

Table 2.12 and Fig. 2.21 show that most of the corries in the glacial region face towards the north-east, east and south-east. It is evident from the figure that 60 corries face north-east, 40 face east and 50 face south-east.

Advantages and Disadvantages of the Use of Polar or Rose Graph

Advantages

1. Multiple sets of data can be easily compared.
2. Lots of data can be represented on a single graph.
3. Easy to understand and interpret.
4. Individual components within the graph can be easily compared.

Disadvantages

1. Linking the data and statistical tests is difficult.
2. It is hard to spot anomalies.
3. It is difficult to choose a suitable scale.

2.5.4 Special Graphs

2.5.4.1 Scatter Graph

A scatter graph is the simplest and easiest way to show the relationship between two variables (bi-variate data) at a glance.
Bi-variate data are data that involve the simultaneous measurement of two variables which can change and which are compared to find the relationship between them. In bi-variate data, one variable is influenced by the other, and thus bi-variate data have an independent variable (X) and a dependent variable (Y) (Table 2.13), because the change of one variable depends on the change of the other. An independent variable is a part of the data or a condition in an experiment that can be changed or controlled. A dependent variable is a part of the data or a condition in an experiment that is affected or influenced by an external factor, most frequently the independent variable.

Bi-variate data can be easily represented in a scatter graph to understand the type and nature of co-relation that exists between the variables. In this method, the independent variable is shown along the ‘X’-axis and the dependent variable is shown along the ‘Y’-axis. Scattered points are obtained by plotting the values of Y with respect to X. Where a trend or correlation exists, a ‘line of best fit’ can then be drawn within the scatter graph (Fig. 2.22). For example, the relationship between height from mean sea level and the number of settlements, the relationship between basin area and run-off etc. can be easily represented using a scatter graph.

Table 2.13 Database for scatter graph (The distributions of air temperature in the month of April around an urban area)

Distance from CBD (km) [X]   Air temperature (°C) [Y]
1.3                          41.25
3.7                          41.02
5.2                          40.39
7.1                          39.87
9.7                          39.58
11.5                         39.01
17.5                         38.09
18.2                         37.73
21.7                         35.25
25.7                         32.80

Fig. 2.22 Scatter graph (Relation between the distance from CBD and air temperature)

Fig. 2.23 Positive, negative and no co-relation

The location and orientation of points in the scatter graph indicate the type and nature of co-relation between variables.
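The ‘line of best fit’ and the strength of the co-relation for Table 2.13 can be estimated numerically. The sketch below uses ordinary least squares and the Pearson coefficient, which is one common choice; the text itself does not prescribe a fitting method, and the function name `fit_line` is hypothetical.

```python
import math

# Table 2.13: distance from CBD (km) and air temperature (deg C)
x = [1.3, 3.7, 5.2, 7.1, 9.7, 11.5, 17.5, 18.2, 21.7, 25.7]
y = [41.25, 41.02, 40.39, 39.87, 39.58, 39.01, 38.09, 37.73, 35.25, 32.80]

def fit_line(xs, ys):
    """Least-squares slope and intercept, plus Pearson r, for paired data."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    # Sums of squares and cross-products about the means
    sxx = sum((a - mean_x) ** 2 for a in xs)
    syy = sum((b - mean_y) ** 2 for b in ys)
    sxy = sum((a - mean_x) * (b - mean_y) for a, b in zip(xs, ys))
    slope = sxy / sxx
    intercept = mean_y - slope * mean_x
    r = sxy / math.sqrt(sxx * syy)
    return slope, intercept, r

slope, intercept, r = fit_line(x, y)
print(f"Y = {intercept:.2f} + ({slope:.3f}) X,  r = {r:.3f}")
```

For these data the slope and r both come out negative, which agrees with the falling trend of temperature away from the CBD seen in Fig. 2.22.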
Positive, Negative and Zero Co-relation

When the points are oriented from lower left to upper right, this indicates a positive co-relation between the variables (Fig. 2.23a). Here, two sets of data or variables steadily tend to move together in the same direction, i.e. an increase in the value of one variable is accompanied by an increase in the value of the other and vice versa. For example, distance travelled and amount of transport cost, the slope of land and rate of soil erosion, income and expenditure of a family etc. are positively co-related. When the values of both variables move together in the same direction in a constant proportion, it is known as a perfect positive co-relation. In this co-relation, all the plotted points lie on a straight line that rises from the lower-left corner to the upper-right corner. Numerically, it is indicated as +1 (r = +1) (Fig. 2.24a).

Fig. 2.24 Perfect positive and negative co-relation

On the other hand, when two sets of data or variables steadily tend to move in opposite directions, i.e. an increase in the value of one variable is accompanied by a decrease in the value of the other and vice versa, it is called inverse or negative co-relation (Fig. 2.23b). For example, height from mean sea level and the number of settlements, distance from forest area and amount of organic matter in the soil, price of a commodity and its demand etc. are negatively co-related. When the values of both variables move in opposite directions in a constant proportion, it is known as perfect negative co-relation. In this co-relation, all the plotted points lie on a straight line falling from the upper-left corner to the lower-right corner. Numerically, it is indicated as −1 (r = −1) (Fig. 2.24b).
When one variable does not change even with a change of the other (a change in one variable does not depend on a change in the other), the relationship between them is called zero co-relation or non-co-relation, i.e. no co-relation exists between the variables (Fig. 2.23c). Examples are marks in physics and marks in geography, the height of persons and their intelligence etc.

Linear and Nonlinear or Curvilinear Co-Relation

Whether a co-relation is linear or nonlinear depends on the constancy of the ratio of change between two variables. In linear co-relation, the amount of change in one variable maintains a constant ratio to the amount of change in the other variable, i.e. the ratio of change of values between the two variables is equal. The points obtained by plotting the values of one variable against the other on a graph cluster around a straight line (Fig. 2.25a). In nonlinear or curvilinear co-relation, the ratio of change of values between the two variables is not constant. The points obtained by plotting the values of one variable against the other cluster around a curve (Fig. 2.25b).

Fig. 2.25 Linear and nonlinear co-relation

2.5.4.2 Ergograph

The term ‘ergograph’ was first used by Dr. Arthur Geddes of the University of Edinburgh. An ergograph is a special kind of multivariate graph which represents the relationship between season, climatic elements and cropping patterns (human activities). The various stages of the cycle of plant growth, i.e. sowing, growing, flowering, maturing, harvesting etc., correspond closely to the seasonal characteristics of weather conditions. Variation of seasons is manifested by different climatic characters and cropping patterns. The time of maturity varies from one crop to another.
Some crops are annual, some are bi-annual and some may require only a few months to mature.

An ergograph can be drawn either by the Cartesian co-ordinate method (rectangular form) or by the polar co-ordinate method (circular form). In the Cartesian co-ordinate method or rectangular form, different climatic elements like mean monthly temperature, rainfall, relative humidity etc. (Table 2.14) are marked along the ‘Y’-axis (vertical axis) in the form of polygraphs. However, the monthly rainfall is generally represented by vertical bars. The 12 months are plotted along the ‘X’-axis (horizontal axis). Below the horizontal axis (primary baseline), a crop calendar is drawn in the form of rectangles on a selected scale to represent the acreage of different crops (Fig. 2.26 and Table 2.15). The length of each rectangle must correspond to the growing season of the crop, while its breadth is directly proportional to the crop acreage on the selected scale. Again, each rectangle may be divided into different parts to indicate the time periods of the various stages of the crops grown (Fig. 2.26).

Table 2.14 Data for ergograph (Monthly temperature, relative humidity and rainfall of Howrah, West Bengal)

Months                  Jan   Feb   Mar   Apr   May    Jun   Jul   Aug    Sep   Oct   Nov   Dec
Temperature (°C)        16    18    30    32    33     30    28    28     27    25    19    17
Relative humidity (%)   51    48    54    60    65     78    80    81     75    70    57    49
Rainfall (cm)           1.5   5.3   7.8   6.9   12.4   22    23    34.3   24    7.5   3.9   0.8

Fig. 2.26 Ergograph showing the relation between seasons, climatic elements and cropping patterns of Howrah, West Bengal
Table 2.15 Data for ergograph (Net acreage of different crops and their growing seasons of Howrah, West Bengal)

Crops    Sowing                              Growing                             Harvesting                          Net acreage ('000)
Aus      April                               May to mid of August                Mid of August to September          200
Aman     July to mid of August               Mid of August to October            November to December                650
Boro     Mid of November to December         January to mid of March             Mid of March to mid of April        500
Jute     April                               May to August                       September to mid of October         120
Pulses   Mid of October to mid of November   Mid of November to mid of February  Mid of February to mid of March     220

Polar Co-ordinate or Circular Ergograph of A. Geddes and G.G. Ogilvie (1938)

A. Geddes and G.G. Ogilvie (1938) developed the polar co-ordinate or circular form of ergograph to show the continuous rhythm of seasonal activities (Table 2.16). In it, the 12 months of a year are marked around the circumference of the circle, forming 30° sectors (Fig. 2.27). Concentric curves are drawn to show the nature of the activities done each month and the amount of time (hours per day) assigned to each type of activity. The time scale, ranging from 0 to 24 h per day, is a square root scale and is represented along the radius of the circle (Fig. 2.27). This type of ergograph is also popularly known as a polar strata graph, polar layer graph or polar line graph (as the data form ‘bands’ on the graph).

2.5.4.3 Ombrothermic Graph

Climatic graphs summarize the trends in temperature and precipitation over a period of no less than 30 years. They help to establish the relationship between temperature and precipitation and to determine the span of dry, wet and extremely wet periods. The ombrothermic graph, also called the Walter-Lieth graph, is an important climatic graph used to compare the average dryness and wetness of a place. The data for an ombrothermic graph must be averages for no less than 30 years.
This graph was first designed and used by the French bio-geographer and naturalist Marcel-Henri Gaussen to graphically depict the mean monthly temperature and monthly precipitation of a place.

Table 2.16 Database for circular ergograph (Rhythmic seasonal activities)

Time devoted to various activities (hour/day)
Months                   Jan   Feb   Mar   Apr   May   Jun   Jul   Aug   Sep   Oct   Nov   Dec
Domestic works           4.5   5     5     5.5   5     5.5   6     5.5   5.5   5.5   5     4.5
Agricultural activities  5     4.5   6     5.5   6.5   6     6     6.5   5.5   6     6     6
Animal husbandry         1.5   1     2     2     1.5   0.5   0.5   0.5   1.5   1.5   1     1.5
Fishing                  0.5   0.5   1     1.5   1.5   2.5   3     2     1.5   1     1     0.5
Entertainment            1.5   2     1     1     1.5   1.5   1     1.5   2     2.5   2     1.5
Others including sleep   11    11    9     8.5   8     8     7.5   8     8     7.5   9     10

Fig. 2.27 Circular ergograph showing the rhythm of seasonal activities (after A. Geddes and G.G. Ogilvie 1938)

Table 2.17 Data for ombrothermic graph (Average temperature and rainfall of Purulia district, West Bengal)

Months                     Jan    Feb    Mar    Apr    May    June   July   Aug    Sep   Oct    Nov    Dec
Average rainfall (mm)      13     24     21     26     45     223    281    300    259   82     10     4
Average temperature (°C)   18.8   21.9   26.8   31.6   33.3   31.2   28.3   28.1   28    26.5   22.2   19.4

Principles and Methods of Construction

(1) The mean monthly temperature (°C) and monthly rainfall (mm) of Purulia district, West Bengal (Table 2.17) have been represented by an ombrothermic graph (Fig. 2.28). To draw this graph, the months of the year are shown along the x-axis, while one y-axis (y1) represents the mean monthly temperature and another y-axis (y2) represents the total monthly precipitation.
(2) The x-axis of the graph should begin with the coldest month of the year. For places located in the Northern Hemisphere, the x-axis should start with January, whereas in the Southern Hemisphere it should start with July.
(3) Mean monthly temperature and monthly precipitation should be expressed in degrees centigrade (°C) and millimetres (mm), respectively. The selection of the scales (y1 and y2) is very important and should follow the relationship:

$$\text{Temperature } (T) = \tfrac{1}{2} \times \text{Precipitation } (P)$$

For example, a mean monthly temperature of 5 °C on the y1-axis has to be level with 10 mm of total monthly precipitation on the y2-axis.
(4) The selection of scales on both axes of this graph is based on the Gaussen-Bagnouls Aridity Index (Sarkar 2015):
(i) Precipitation more than three times the temperature (P > 3T) indicates the wet period.
(ii) Precipitation between two and three times the temperature (3T > P > 2T) indicates the semi-wet period.
(iii) Precipitation less than two times the temperature (P < 2T) indicates the arid or dry period.
(5) Generally, the precipitation and temperature curves are drawn as blue and red lines, respectively.
(6) When the precipitation curve lies below the temperature curve, it indicates a period of dry condition or xeric period, but if the precipitation curve lies above the temperature curve, it indicates a period of wet condition. When the precipitation curve exceeds 100 mm, it signifies a period of excessively wet condition (Fig. 2.28).
(7) The station name and its elevation should be mentioned at the top left, the average temperature and average rainfall at the top right, and the extremes of temperature on the second line.

Fig. 2.28 Ombrothermic graph of Purulia district, West Bengal

Inference: The station has a dry span between November and May and a wet span from June to October.
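The inference can be checked directly against Table 2.17 by applying the Gaussen-Bagnouls thresholds month by month. The following is a minimal sketch under two stated assumptions: the function name `classify` is hypothetical, and months falling exactly on a threshold (P = 2T or P = 3T), which the text does not cover, are assigned to the drier class.

```python
MONTHS = ["Jan", "Feb", "Mar", "Apr", "May", "Jun",
          "Jul", "Aug", "Sep", "Oct", "Nov", "Dec"]

# Table 2.17: Purulia district, West Bengal
rain_mm = [13, 24, 21, 26, 45, 223, 281, 300, 259, 82, 10, 4]
temp_c = [18.8, 21.9, 26.8, 31.6, 33.3, 31.2, 28.3, 28.1, 28, 26.5, 22.2, 19.4]

def classify(p, t):
    """Gaussen-Bagnouls class for one month (P in mm, T in deg C).

    Boundary cases (P equal to 2T or 3T) are not defined in the text;
    here they fall into the drier class, which is an assumption.
    """
    if p > 3 * t:
        return "wet"
    if p > 2 * t:
        return "semi-wet"
    return "dry"

classes = {m: classify(p, t) for m, p, t in zip(MONTHS, rain_mm, temp_c)}

dry = [m for m in MONTHS if classes[m] == "dry"]
wet = [m for m in MONTHS if classes[m] == "wet"]
# Months whose precipitation exceeds 100 mm (excessively wet)
very_wet = [m for m, p in zip(MONTHS, rain_mm) if p > 100]
```

With these data no month falls in the semi-wet class; October (82 mm against 3T = 79.5) only just qualifies as wet.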
Between June and September, it is an excessively wet period.

Demerits of this graph

The major demerits of this graph are as follows:

(1) It relies on a scale that applies only to mid-latitude climates.
(2) It can only be constructed and read using the metric system.

2.5.4.4 Water Balance or Water Budget Curve

In nature, water is almost constantly in motion due to changes of its state from liquid to solid or vapour in appropriate environments. The law of mass conservation signifies that, within a particular area and for a specific period of time, the inflows and outflows of water are equal once any change of storage of water in the area is included, i.e. the water coming into an area has to either leave the area or be stored within it. The supply of groundwater in an area indicates whether a stage is one of water utilization, deficiency, recharge or surplus.

Water balance techniques are very important for the solution of various theoretical and practical hydrological problems and disasters. This approach helps us to evaluate water resources quantitatively, along with their transformation under the influence of human activities. Detailed knowledge of the water balance structure of river basins and groundwater basins offers a platform for validating various hydrological projects for the control, redistribution and rational use of water resources with respect to time and space (Sokolov and Chapman 1974).

Formulation of Water Balance Techniques

Techniques of water balance estimation can be formulated with the following parameters and equations:

(A) Gains: Precipitation (P)
(B) Soil moisture recharge or storage (R)
(C) Losses: Utilization (U) and evapotranspiration
    (a) Actual evapotranspiration (AE)
    (b) Potential evapotranspiration (PE)

Simple water balance

1. Environments with abundant moisture condition: P > PE, thus AE = PE
2. Environments with inadequate moisture condition: P < PE, thus AE < PE
3. Environments with seasonal moisture condition.

In seasonal moisture environments, the monthly water budget is calculated from the following parameters:

a. Precipitation (P)
b. Potential evapotranspiration (PE)
c. Actual evapotranspiration (AE)
d. Change in water storage (ΔST)
e. Difference between P and PE (P − PE)
f. Deficiency of water (D)
g. Soil moisture storage up to field capacity (ST)
h. Surplus of water (S): After field capacity (ST) is attained, excess precipitation (P) is available as surplus.

A water balance equation is simply expressed as follows:

$$P = Q + E \pm \Delta ST \qquad (2.15)$$

where P is precipitation, Q is run-off, E is evaporation and ΔST is the change in surface, sub-surface and groundwater storage.

In seasonal moisture environments, in the annual cycle of water balance estimation, a period of soil moisture recharge (R) is followed by a period of water surplus (S), and subsequently, a period of soil moisture utilization (U) is followed by a period of water deficiency (D) (Sutcliffe et al. 1981) (Table 2.18). The surplus of water comprises both surface run-off and groundwater storage.

Table 2.18 Water need and supply (mm) of a region (field capacity: 100 mm)

Month                  Jan    Feb   Mar   Apr   May   June   Jul   Aug   Sep   Oct   Nov   Dec
Supply of water (mm)   125    105   117   128   125   96     82    81    80    72    102   103
Need of water (mm)     6      12    33    66    108   149    170   156   113   63    20    12
Supply minus need      +119   +93   +84   +62   +17   −53    −88   −75   −33   +9    +82   +91
Water budget section   S      S     S     S     S     U      U/D   D     D     R     R     R/S
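The ‘Water budget section’ row of Table 2.18 can be reproduced with simple bookkeeping in which the soil is treated as a store that holds at most the field capacity. The sketch below is illustrative only: the function name `budget_sections` is hypothetical, and it assumes that the soil is already at field capacity at the start of January, which is consistent with the surplus shown for that month in the table.

```python
supply = [125, 105, 117, 128, 125, 96, 82, 81, 80, 72, 102, 103]   # mm
need = [6, 12, 33, 66, 108, 149, 170, 156, 113, 63, 20, 12]        # mm
FIELD_CAPACITY = 100                                               # mm

def budget_sections(supply, need, capacity, start_storage):
    """Label each month S, U, U/D, D, R or R/S from supply minus need."""
    storage = start_storage
    labels = []
    for s, n in zip(supply, need):
        diff = s - n
        if diff >= 0:
            room = capacity - storage          # space left in the soil
            if room == 0:
                label = "S"                    # soil full: all excess is surplus
            elif diff > room:
                label = "R/S"                  # soil fills up during the month
            else:
                label = "R"                    # recharge only
            storage = min(capacity, storage + diff)
        else:
            draw = -diff                       # water demanded from storage
            if storage == 0:
                label = "D"                    # nothing left: deficiency
            elif draw > storage:
                label = "U/D"                  # storage runs out this month
            else:
                label = "U"                    # utilization only
            storage = max(0, storage - draw)
        labels.append(label)
    return labels

# Assumption: soil at field capacity at the start of January
sections = budget_sections(supply, need, FIELD_CAPACITY, FIELD_CAPACITY)
```

Running the function on the supply and need rows gives the same sequence of letters as the last row of Table 2.18, with July and December as the two transition months.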
Procedures for Determining the Status of Water Availability

In an annual cycle of water balance estimation, the periods of soil moisture recharge (R), water surplus (S), soil moisture utilization (U) and water deficiency (D) are identified by the following procedures:

(a) Table 2.18 shows the month-wise water need and supply of a region for a theoretical one-year period, assuming the field capacity of the soil to be 100 mm. From January to May, the values of ‘Supply minus need’ are positive (the supply of water is more than the need), but from June to September the negative values (the need of water is more than the supply) indicate that more water will be withdrawn from underground than will be recharged. Therefore, June is the month from which the utilization of stored water starts. In the ‘Water budget section’ of Table 2.18, write ‘U’ for the utilization of water in June.
(b) Utilization of water continues until the total of the negative values reaches 100 (the field capacity of the soil is 100 mm). If we add the values −53 and −88 for the months of June and July, the total withdrawal (141) exceeds 100. This indicates that July is the month of transition from water utilization to deficiency. Therefore, in the ‘Water budget section’ write ‘U/D’ for the month of July (Table 2.18).
(c) In the ‘Water budget section’ write ‘D’ for all the following months (in Table 2.18, August and September) to indicate deficiency of water (the need for water is more than the supply) until the value becomes positive again. The positive values (the supply of water is more than the need) indicate the recharge of water into the ground, which is marked by the letter ‘R’ in the ‘Water budget section’ (October and November in Table 2.18).
(d) Again, the recharge of water is converted to water surplus when the total of the successive positive values exceeds 100 (field capacity of the soil).
In the month of December, the total of positive values exceeds 100 (9 + 82 + 91 = 182) indicating the time of transition from recharge to surplus of water. Therefore, write ‘R/S’ in the ‘Water budget section’ for that month. Then all the subsequent positive value is considered as the surplus of water and the letter ‘S’ is written in the ‘Water budget section’ (months of January–May in Table 2.18). In this way, we will be able to easily determine the status of water availability of any region for all the months in a year and rational decisions can be made on how much and when water should be allocated for different purposes. The surplus and deficiency of water of the sample study area during a normal rainfall year are shown at monthly intervals in Table 2.19 and Fig. 2.29. Month-wise rainfall (precipitation, P) and potential evapotranspiration (PE) have actually been superimposed in Fig. 2.29. This figure reveals that from themonth of January toApril amount of precipitation (P) exceeds potential evapotranspiration (PE), which is an indicative of surplus ofwater during this period. Then the potential evapotranspiration exceeds precipitation from the month of May to September. Consequently, the use 2.5 Types of Graphical Representation of Data 95 Table 2.19 Water budget estimation for a sample study area (elevation: 12 m; field capacity: 102 mm) Month Jan Feb Mar Apr May June Jul Aug Sep Oct Nov Dec Total P (mm) 148 123 103 67 55 50 35 35 61 119 152 170 1118 PE (mm) 13 21 29 39 58 76 89 82 70 50 24 14 565 P − PE +135 +102 +74 +28 −3 −26 −54 −47 −9 +69 +128 +156 �ST 0 0 0 0 −3 −26 −54 −19 0 +69 +33 0 ST (mm) 102 102 102 102 99 73 19 0 0 69 102 102 AE (mm) 13 21 29 39 58 76 89 54 61 50 24 14 528 D (mm) 0 0 0 0 0 0 0 28 9 0 0 0 37 S (mm) 135 102 74 28 0 0 0 0 0 0 95 156 590 of stored soil water was started and complete utilization of the stored water causes a deficit of water from the middle of Augustto September. The perpendicular line (R) drawn on the graph (Fig. 
2.29) represents the complete utilization of stored soil moisture. Precipitation again exceeds potential evapotranspiration from October, and this continues up to April. A part of this excess compensates for the loss of soil moisture, i.e. it triggers the recharge of water, which is completed by the middle of November. The perpendicular line (U) drawn on the graph (Fig. 2.29) indicates that the field capacity (102 mm) has been reached, i.e. the soil moisture has been fully restored. The excess water obtained after the field capacity is attained is termed the surplus of water (from the middle of November to April) (Fig. 2.29).

Fig. 2.29 Water balance curve of a sample study area

Applicability of Water Balance Estimation

1. Water balance estimation is a useful method to assess the present status and trends of the availability of water resources in a region for a particular period of time.
2. Estimation of water balance assesses and improves the validity of visions, scenarios and strategies, which strengthens decision-making for the proper management of water.

2.5.4.5 Hydrograph

A streamflow or discharge hydrograph at any point on a stream is a graph showing the flow rate as a function of time at that point. In this graph, the discharge is plotted on the y-axis (ordinate) and time on the x-axis (abscissa) (Fig. 2.30). The units of time may be minutes, hours or days, and the rate of flow (discharge) is generally expressed in cubic meters per second (cumec) or cubic feet per second (cusec). Thus, the hydrograph is an important graphical representation of the topographic and climatic characteristics which control the inter-relationship between rainfall and run-off of a particular drainage basin (Chow 1959).
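Because a hydrograph is simply discharge recorded against time, its basic readings can be taken directly from a sampled series. The following is a minimal sketch under stated assumptions: the hourly discharge values are invented for illustration (they are not read from Fig. 2.30), and the helper names `peak_of_hydrograph` and `start_of_rise` are hypothetical. It finds the peak discharge, the time from the start of the rise to the peak, and converts cumec to cusec.

```python
# Hypothetical storm hydrograph: discharge in m^3/s (cumec), sampled hourly.
# All numbers are invented for illustration only.
CUSEC_PER_CUMEC = 35.3147  # 1 m^3/s expressed in ft^3/s

time_h = [0, 1, 2, 3, 4, 5, 6, 7, 8, 9, 10]
discharge_cumec = [5.0, 5.0, 9.0, 21.0, 38.0, 30.0, 19.0, 12.0, 8.0, 6.0, 5.5]


def peak_of_hydrograph(times, flows):
    """Return (time of peak, peak discharge) for a sampled hydrograph."""
    i_peak = max(range(len(flows)), key=lambda i: flows[i])
    return times[i_peak], flows[i_peak]


def start_of_rise(times, flows):
    """Return the time of the last sample before discharge first increases."""
    for i in range(1, len(flows)):
        if flows[i] > flows[i - 1]:
            return times[i - 1]
    return None  # discharge never rises in this record


t_peak, q_peak = peak_of_hydrograph(time_h, discharge_cumec)
t_rise = start_of_rise(time_h, discharge_cumec)
time_to_peak = t_peak - t_rise

print(f"peak discharge: {q_peak} cumec = {q_peak * CUSEC_PER_CUMEC:.0f} cusec")
print(f"time to peak: {time_to_peak} h")
```

With a real gauge record the same two readings would normally be taken after smoothing or quality control, which this sketch omits.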
Two types of hydrographs are particularly important, the annual hydrograph and the storm hydrograph, but in hydrology it is more useful to consider the hydrograph of a particular storm event (storm hydrograph). A storm hydrograph reflects the influence of all physical characteristics of the river basin and, to some extent, also reflects the characteristics of the storm causing the hydrograph. A hydrograph can be considered a thumbprint of a drainage basin. The shape of a hydrograph mainly depends on the rate of transfer of water from different parts of the river basin to the gauge station. No two river basins produce the same hydrograph for the same storm. Hydrographs from similar river basins may be similar, but not the same. In the same way, no two storms generate identical hydrographs from the same basin.

Fig. 2.30 Elements of a hydrograph

Elements of the Hydrograph

The elements of the hydrograph are:

(1) Rising limb: As surface run-off reaches the gauge station, the water level in the channel begins to rise. With the progress of time, more and more surface run-off reaches the gauge and the water in the channel continues to rise until it reaches a maximum discharge, recorded as the maximum gauge height. The rising portion of the hydrograph indicated by the rising stage is called the rising limb (Fig. 2.30). The rising limb graphically represents increasing discharge over time.

(2) Crest and peak discharge: The time interval of the greatest discharge at the peak of the hydrograph is called the crest (Fig. 2.30). It may be of short duration, represented as a sharp peak, or of fairly long duration, represented as a flat peak. The crest does not necessarily represent a constant discharge; rather, it represents a zone of nearly equal highest discharges.
The greatest discharge within the crest is called the peak discharge (Fig. 2.30) and it is of primary interest in hydrologic design.

(3) Recession or falling limb: The segment of the hydrograph after the peak is called the recession limb or falling limb (Fig. 2.30). It indicates the decrease of discharge as water is withdrawn from the river basin storage after rainfall ceases. The steepness of the recession limb represents the rate at which water drains from the basin area.

(4) Point of inflection: The point on the recession limb indicating the end of storm flow (i.e. quick flow or direct run-off) and the return to groundwater flow (i.e. base flow) is known as the point of inflection (Fig. 2.30). In other words, it is the point on the recession limb of the hydrograph where the steepness of the slope of the graph starts to decline. It marks the point from which base flow contributes more to the total flow than the quick-response run-off.

(5) Time to peak: It is the time interval from the beginning of the rising limb (the beginning of the increase of discharge at the gauge station) to the peak discharge (Fig. 2.30). It is largely controlled by the characteristics of the drainage basin such as travel distances, drainage density, channel slope, channel roughness and soil infiltration capacity. The spatial distribution of rainfall over the basin also strongly affects the time to peak. For example, the hydrograph of a storm rainfall occurring on the upper part of the basin has a longer time to peak than that of a storm rainfall occurring on the lower part of the basin.

(6) Time of concentration: The time of concentration is the time needed for a drop of water falling on the most distant part of the river basin to reach the gauge station or the basin outlet (Fig. 2.30).
It includes the time needed for all parts of the river basin to contribute run-off to the hydrograph, and this time then indicates the highest discharge that can occur from a certain storm intensity over the river basin area.

(7) Lag time: Lag time is the time interval between the centre of mass of effective rainfall and the centre of mass of the direct run-off hydrograph. As the determination of the centre of mass of the direct run-off hydrograph is difficult, lag time is also defined as the time interval between the centre of mass of effective rainfall and the peak of the direct run-off hydrograph (Fig. 2.30). It assumes uniform effective rainfall over the entire basin area.

Factors Affecting Hydrograph Characteristics

Several factors affect the characteristics of a stream hydrograph. These include:

(1) Drainage characteristics: The characteristics of the drainage are primarily derived from the parent geology of the drainage basin and affect the characteristics of the streamflow as well as the hydrograph. Large drainage basins receive more precipitation (rainfall) than smaller basins, so they have a greater peak discharge. Generally, smaller basins have shorter lag times than larger basins because rainwater does not have to travel long distances. Circular basins lead to shorter lag times and a greater peak discharge than elongated basins because the water in the former has a shorter distance to travel to reach a river. River basins with a steep slope are likely to have a shorter lag time than basins with a gentle slope because the water in the former flows more quickly down to the river. Basins with high drainage density drain more quickly, so they have a shorter lag time. In a saturated river basin, the surface run-off increases and rainwater reaches the river more quickly, which reduces the lag time.
If the drainage basin is dominated by impermeable rock, infiltration will be reduced and surface run-off will be higher, which increases the peak discharge and reduces the lag time.

(2) Type and distribution of precipitation: Precipitation in the form of snow is likely to produce a greater lag time than rainfall because snow takes time to melt before the water reaches the river channel. The amount of rainfall is very important in controlling the nature of the storm hydrograph. Heavy storm rainfall supplies more water to the drainage basin, leading to a higher discharge.

(3) Soil and land use: Soil type and land use pattern may alter the characteristics of the hydrograph. Forest removal, grass cutting, urbanization, farming, and the building of roads and other structures reduce the rate of infiltration and increase the run-off. The presence of more vegetation in the basin area intercepts precipitation (rainfall) and slows the draining of water into river channels, and so the lag time increases.

(4) Human factors: Rapid urbanization increases the concentration of impermeable materials on the surface, which reduces infiltration and increases surface run-off. This results in an increase in peak discharge and a shorter lag time.

Delineation of Run-Off Components in Storm Hydrograph

A streamflow hydrograph of a specific storm is a hydrograph of total run-off. The components of streamflow are (a) direct run-off and (b) base flow. Direct run-off is further divided into surface run-off and quick interflow, whereas base flow is divided into delayed interflow and groundwater run-off.

Surface Run-Off

Surface run-off is that portion of the run-off water that travels over the ground surface to the stream channel (Fig. 2.31a). Most surface run-off flows to the first-order channels because they collectively drain the largest area of the drainage basin.
Surface run-off includes that portion of the precipitation falling directly on the water flowing in the channel, but overland flow does not include this portion of the precipitation.

Fig. 2.31 Various components of run-off (after Singh 1994)

Interflow or Sub-surface Flow

Interflow is the water that infiltrates the surface layer and moves laterally beneath the surface to a channel (Fig. 2.31c). Interflow can occur on forest floors, where needles, leaves and other plant debris cover the ground surface. In interflow, the water is subject to higher flow resistance than surface run-off. Because of this, interflow water does not move as fast as surface run-off and is therefore delayed in reaching the stream channel.

Direct Run-Off

Direct run-off is considered to be the sum of surface run-off and interflow. It is frequently equated with surface run-off. These two flow components move faster than groundwater flow and hence are often lumped together for hydrologic purposes.

Base Flow

Base flow or groundwater flow is that component of the flow that is contributed to the stream channel through groundwater (Fig. 2.31d). Groundwater flow arises when surface water infiltrates to the water table and then moves laterally to the stream channel through the aquifer. Such water moves much more slowly than direct run-off, and for this reason it does not contribute to the peak discharge of a given storm hydrograph.

The important components of streamflow can be easily separated and illustrated in a storm hydrograph. In Fig. 2.32, point ‘A’ marks the initiation of the surface run-off, which is believed to end at the change in slope shown as ‘B’; point ‘B’ is considered to be the initiation of interflow, which ends at point ‘C’; point ‘C’ marks the initiation of groundwater flow, which continues up to the end of the hydrograph.
Therefore, the segment of the curve from A to B indicates the contribution of surface run-off, the segment from B to C indicates the contribution of interflow, especially the quick interflow, and the segment beyond ‘C’ indicates the contribution of delayed interflow, base flow or groundwater flow.

Fig. 2.32 Important components of streamflow hydrograph

2.5.4.6 Rating Curve

The rating curve (also called the stage–discharge relation curve) is the graphical representation of the relationship between stream stage and stream flow or discharge of water (cusec or cumec) (Fig. 2.33 and Table 2.20) for a given point on a stream, generally at a gauging station. In a rating curve, measured discharge is usually plotted on the x-axis (abscissa) and measured stage on the y-axis (ordinate) (Fig. 2.33). Stream stage (also called gauge height or stage) means the height of the water surface (in feet) above a well-known elevation where the stage is zero. Though the zero stage is arbitrary, it is generally close to the streambed.

Fig. 2.33 Rating curve (Relationship between stream stage and discharge)

Table 2.20 Stream stage and discharge relationship

| Sl. no | Gauge height (ft) | Discharge (cubic ft/s) | Sl. no | Gauge height (ft) | Discharge (cubic ft/s) |
|---|---|---|---|---|---|
| 1 | 0.5 | 16 | 19 | 6.1 | 150 |
| 2 | 0.65 | 26 | 20 | 6.8 | 162 |
| 3 | 1.0 | 30 | 21 | 7.25 | 163 |
| 4 | 1.15 | 37 | 22 | 7.6 | 210 |
| 5 | 1.5 | 35 | 23 | 8.6 | 220 |
| 6 | 2.0 | 38 | 24 | 8.5 | 230 |
| 7 | 2.2 | 51 | 25 | 9.3 | 269 |
| 8 | 2.6 | 52 | 26 | 9.6 | 315 |
| 9 | 3.1 | 78 | 27 | 9.5 | 296 |
| 10 | 3.25 | 62 | 28 | 9.75 | 310 |
| 11 | 3.85 | 78 | 29 | 10.4 | 340 |
| 12 | 4.1 | 95 | 30 | 10.6 | 371 |
| 13 | 4.5 | 96 | 31 | 10.7 | 367 |
| 14 | 4.65 | 104 | 32 | 10.75 | 384 |
| 15 | 4.9 | 120 | 33 | 10.9 | 401 |
| 16 | 5.0 | 103 | 34 | 10.95 | 406 |
| 17 | 5.6 | 122 | 35 | 11.1 | 430 |
| 18 | 5.8 | 132 | 36 | 11.2 | 407 |

Stream discharge is measured several times over a range of stream stages. These measurements are made over a period of months or years in order to establish an accurate relationship between the discharge and the gauge height at the gauging station.
Additionally, these relationships must be verified continually against ongoing stream flow or discharge measurements because stream channels are always changing.

Controls of Rating Curve

The rating curve represents the integrated result of a wide variety of channel and flow parameters. The combined effect of these parameters is termed the control. If the rating curve (stage–discharge relationship) remains the same with time, it is called a permanent control. But if the relationship changes with time, it is called a shifting control. When the control of a gauging station changes, the rating curve also changes. The change may be caused by (1) erosion or deposition, (2) rapidly changing flow, (3) varying backwater, and (4) changes in the flow because of dredging, channel encroachment and weed growth. For the shifting control due to cases (1) and (4), frequent current meter gauging is required (Singh 1994). Bedrock-bottomed parts of streams, metal or concrete structures, and weirs are generally, though not always, thought of as permanent controls.

Steps of Development of Rating Curve

The development of a rating curve generally involves three steps:

1. Measuring stream stage

A continuous record of the stream stage, i.e. the height of the water surface at a specific location along a stream, is taken. Various methods are used to measure the stream stage or gauge height. A common method uses a stilling well in the river bank or fixed to a bridge pier. Water from the stream enters and leaves the stilling well through underwater pipes, which allow the water surface in the stilling well to be at the same height as the water surface in the stream. The height of the water surface inside the stilling well can be easily measured using a float or a pressure, optical or acoustic sensor. The measured stream stage values are stored in an electronic data recorder at a regular time interval, generally every 15 min.
Where a stilling well is not cost-effective to install, the stream stage can instead be determined by measuring the pressure needed to maintain a small flow of gas through a tube that is bubbled out at a specified location below the water surface in the stream. The pressure is directly linked to the height of the water column over the tube outlet in the stream. The greater the height of water above the tube outlet, the more pressure is needed to push the gas bubbles through the tube.

2. The discharge measurement

Discharge of water (the volume of water passing a specific location along a stream per unit of time) is measured periodically. Generally, stream discharge is estimated by multiplying the area of the stream cross-section by the average water velocity in that cross-section:

Discharge of water = area of the cross-section × average water velocity (2.16)

Numerous methods and types of equipment are used to measure the water velocity and cross-sectional area, including the water current meter and the acoustic Doppler current profiler.

The current meter is used to measure the velocity of water at prefixed places (sub-sections) along a specified line, such as a bridge or a suspended cableway across a stream or river. In this technique, the stream cross-section is divided into a number of vertical sub-sections. The area of each sub-section is obtained by multiplying its width by its depth, and the water velocity is measured using the current meter. The discharge of water in each sub-section is then computed by multiplying the sub-section area by the measured water velocity. The total discharge in a cross-section is calculated by summing the discharges of all sub-sections.

An acoustic Doppler current profiler (ADCP) can also be used to measure the water discharge. An ADCP uses the Doppler effect to measure the water velocity.
The ADCP transmits a sound pulse into the stream water and detects the shift in the frequency of that pulse as it is reflected back to the receiver of the ADCP by sediment or other particulate matter transported in the water. The change in frequency, or Doppler shift, detected by the ADCP is then converted into water velocity. The discharge of water is then calculated by multiplying the cross-section area by the measured water velocity.

3. The stage–discharge relation

Once the naturally occurring and continuously changing relationship between stream stage and discharge has been identified, the stage–discharge relation is applied to transform the frequently measured stream stage into estimates of discharge (USGS Science for a changing world).

Simple Rating Curve

The representation of measured values of stream stage and water discharge on an arithmetic scale results in a parabolic curve, which can be expressed as

$$Q = a(h - b)^{c} \tag{2.17}$$

where b is a constant indicating the gauge reading for zero discharge, and a and c are rating curve constants. When the measured stream stage and water discharge data are represented on a logarithmic scale, a straight line results and Eq. (2.17) becomes

$$\log Q = \log a + c \log(h - b) \tag{2.18}$$

The values of the constants a and c can be obtained using the least squares method. The value of the constant b must be determined beforehand, and this can be done in different ways. A trial-and-error method can be used to find the value of b that gives the best-fit curve. Another way is to extrapolate the rating curve to Q = 0 and then plot log Q versus log(h − b). If the plotted values give a straight line, then the value of b obtained by extrapolation is correct and acceptable. Otherwise, another value in the vicinity of the previous value of b is chosen and the same procedure is repeated (Singh 1994). The value of b can also be computed analytically.
From a smooth curve of Q versus h, three values of discharge Q1, Q2 and Q3 are selected in such a way that $Q_1/Q_2 = Q_2/Q_3$. The corresponding values of the stage are h1, h2 and h3. Then we have

$$\frac{(h_1 - b)^{c}}{(h_2 - b)^{c}} = \frac{(h_2 - b)^{c}}{(h_3 - b)^{c}} \tag{2.19}$$

or

$$\frac{h_1 - b}{h_2 - b} = \frac{h_2 - b}{h_3 - b} \tag{2.20}$$

from which the value of b is derived as

$$b = \frac{h_1 h_3 - h_2^{2}}{h_1 + h_3 - 2h_2} \tag{2.21}$$

Alternatively, the values of the parameters a, b and c can be obtained by optimization.

Generally, a simple rating curve is satisfactory for most streams in which rapid fluctuations of stream stage are not experienced at the gauging section. The adequacy of the curve is measured by the scatter of values around the fitted curve. If there is a permanent control, the rating curve is essentially permanent. At some gauging stations, there may be two or more controls, each for a specific range of stream stage. The rating curve at such a station is discontinuous; the point of discontinuity corresponds to the stream stage at which the control changes. An example is a weir control that begins to be submerged when the tailwater level below the control rises above the lowest point of the control. Even in such situations, the simple rating curve may well be acceptable if the control is permanent and free of backwater and the slope of the stream is steep.

Uses of Rating Curve

Continuous measurement at stream gauges provides streamflow or discharge information which can be used for different purposes, including:

(i) Flood prediction
(ii) Water management and allocation
(iii) Engineering design
(iv) Research purposes
(v) Operation of locks and dams
(vi) Recreational safety and enjoyment etc.
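The two computational steps of the simple rating curve, the analytical estimate of b in Eq. (2.21) and the least-squares fit of the straight line in Eq. (2.18), can be sketched as follows. This is only an illustration under stated assumptions: the gaugings are synthetic values generated from Q = 3(h − 0.2)², not the data of Table 2.20, and the function names `zero_flow_stage` and `fit_rating_curve` are hypothetical.

```python
import math


def zero_flow_stage(h1, h2, h3):
    """Eq. (2.21): gauge reading b for zero discharge.

    h1, h2, h3 are stages read from a smooth rating curve at discharges
    chosen so that Q1 / Q2 = Q2 / Q3.
    """
    return (h1 * h3 - h2 ** 2) / (h1 + h3 - 2 * h2)


def fit_rating_curve(stages, discharges, b):
    """Least-squares fit of Eq. (2.18): log Q = log a + c log(h - b).

    Returns the constants (a, c) of Q = a (h - b)^c for a given b.
    """
    xs = [math.log10(h - b) for h in stages]
    ys = [math.log10(q) for q in discharges]
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxx = sum((x - mean_x) ** 2 for x in xs)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    c = sxy / sxx                      # slope of the log-log line
    a = 10 ** (mean_y - c * mean_x)    # the intercept is log a
    return a, c


# Synthetic gaugings generated from Q = 3 (h - 0.2)^2 (invented values)
stages = [0.7, 1.2, 2.2, 3.2, 4.2, 5.2]
discharges = [3.0 * (h - 0.2) ** 2 for h in stages]

# Q1 = 3, Q2 = 12 and Q3 = 48 satisfy Q1/Q2 = Q2/Q3; their stages are 1.2, 2.2, 4.2
b = zero_flow_stage(1.2, 2.2, 4.2)
a, c = fit_rating_curve(stages, discharges, b)
print(f"b = {b:.2f}, a = {a:.2f}, c = {c:.2f}")
```

Because the synthetic data lie exactly on the curve, the fit recovers a ≈ 3 and c ≈ 2. With field gaugings such as those in Table 2.20, the points scatter about the line, and that scatter is the measure of adequacy described above.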
2.5.4.7 Lorenz Curve and Gini Coefficient

The concentration or inequality in the distribution of any phenomenon, attribute or variable with respect to others can be studied by several methods such as (1) the Lorenz curve and Gini coefficient, (2) the location quotient and (3) the index of dissimilarity (Mahmood 1999). The Lorenz curve is the graphical method and the Gini coefficient is the numerical or mathematical method of measuring the degree of inequality of different phenomena or variables.

The Lorenz curve (or Pareto curve), a form of percentage cumulative frequency curve, was first developed in 1905 by Max Otto Lorenz, an American economist, for representing the inequality in the distribution of wealth or income. It is an effective graphical measure of inequality in the distribution of various items in social science, especially in geography, such as studies on landholdings, income, expenditure, wealth and economic activities (Pal 1998). It basically deals with the cumulative percentage distributions of two attributes or variables at different points. For example, for the representation of the income of the people in a country, this curve appears as a graph of population shares against their income shares, ordered from poorest to richest.

Techniques of Drawing of Lorenz Curve

(1) At first, both the variables are expressed in percentages (%), arranged in ascending or descending order, and their cumulative percentages are calculated (Tables 2.21, 2.22 and 2.23). Cumulative percentages of one variable (independent) are plotted on the x-axis and cumulative percentages of the other variable (dependent) are plotted on the y-axis. For example, in the study of the number and area of landholdings, the cumulative percentage of the number of landholdings is plotted on the x-axis and the cumulative percentage of the area of landholdings is plotted on the y-axis (Fig. 2.34).
Similarly, in the study of income distribution of the people, the cumulative percentage of population is plotted on the x-axis and the cumulative percentage of income is plotted on the y-axis (Fig. 2.36).

(2) Cumulative percentages of one variable up to certain points are plotted on a graph against the cumulative percentages of the other variable up to the same points. The different points so obtained are then joined by a smooth freehand curve, known as the Lorenz curve.

(3) For comparison, a diagonal line at an angle of 45° is also drawn, joining the point of origin or lower-left corner (x = 0% and y = 0%) and the last point or upper-right corner (x = 100% and y = 100%) of the graph. This line is called the ‘line of equal distribution’ (Figs. 2.34, 2.35 and 2.36). The Lorenz curve will never cross the line of equal distribution.

How to Read the Lorenz Curve

(1) If a 1% share of the ‘x’ variable corresponds to a 1% share of the ‘y’ variable, a 50% share of ‘x’ corresponds to a 50% share of ‘y’ and an n% share of ‘x’ corresponds to an n% share of ‘y’, then the two variables are equally distributed. Thus the Lorenz curve in the ideal case would be a straight line (the equal distribution line). For example, if we want to understand the degree of inequality in the distribution of income of the people in a region or country and if 1% of the population has 1% of the total income, 50% of the population has 50% of the total income and n% of the population has n% of the total income, then the distribution of income among the people is perfectly equal.

(2) The deviation of any Lorenz curve from the equal distribution line is in proportion to the degree of inequality in the distribution of one variable in relation to the other. The further the Lorenz curve lies from the line of equal distribution, the greater is the inequality. If the Lorenz curve coincides with the line of equal distribution, it indicates ‘0’ inequality. But if the curve coincides with the ‘x’ and ‘y’ axes, it indicates the maximum (unity or 100%) inequality.

Table 2.21 Worksheet for Lorenz curve (The number and area of landholdings)

| Size of landholdings (hectares) | No. of landholdings (in millions) | Area of landholdings (in million hectares) | % of no. of landholdings to total no. of landholdings | % of area of landholdings to total area of landholdings | Cumulative % of no. of landholdings (Xi) | Cumulative % of area of landholdings (Yi) | Xi·Yi+1 | Yi·Xi+1 |
|---|---|---|---|---|---|---|---|---|
| < 2 | 28 | 48 | 26.17 | 6.64 | 26.17 | 6.64 | 456.14 | 310.29 |
| 2–4 | 22 | 78 | 20.56 | 10.79 | 46.73 | 17.43 | 1292.55 | 1107.68 |
| 4–6 | 18 | 74 | 16.82 | 10.23 | 63.55 | 27.66 | 2900.42 | 2171.31 |
| 6–10 | 16 | 130 | 14.95 | 17.98 | 78.5 | 45.64 | 5211.61 | 4051.92 |
| 10–15 | 11 | 150 | 10.28 | 20.75 | 88.78 | 66.39 | 7674.14 | 6390.70 |
| 15–20 | 8 | 145 | 7.48 | 20.05 | 96.26 | 86.44 | 9252.51 | 8462.75 |
| 20–25 | 3 | 70 | 2.80 | 9.68 | 99.06 | 96.12 | 9906 | 9612 |
| > 25 | 1 | 28 | 0.93 | 3.87 | 100 | 100 | | |
| ∑ | 107 | 723 | 100% | 100% | | | 36,693.37 | 32,106.65 |

Table 2.22 Worksheet for Lorenz curve (Total and urban population of six North Bengal districts of West Bengal)

| Name of the districts | Total population (TP) | Urban population (UP) | % of UP to TP | % of TP to Grand TP (1) | % of UP to Total UP (2) | Ascending order of % of UP to TP | Arrangement of 1 (3) | Arrangement of 2 (4) |
|---|---|---|---|---|---|---|---|---|
| Darjeeling | 1,842,034 | 718,175 | 38.99 | 10.71 | 22.43 | 10.25 | 16.41 | 9.03 |
| Jalpaiguri | 3,869,675 | 1,044,674 | 27.00 | 22.49 | 32.62 | 12.07 | 17.44 | 11.31 |
| Koch Bihar | 2,822,780 | 289,300 | 10.25 | 16.41 | 9.03 | 13.80 | 23.24 | 17.23 |
| Malda | 3,997,970 | 551,914 | 13.80 | 23.24 | 17.23 | 14.13 | 9.71 | 7.37 |
| Uttar Dinajpur | 3,000,849 | 362,187 | 12.07 | 17.44 | 11.31 | 27.00 | 22.49 | 32.62 |
| Dakshin Dinajpur | 1,670,931 | 236,075 | 14.13 | 9.71 | 7.37 | 38.99 | 10.71 | 22.43 |
| Total | 17,204,239 | 3,202,325 | | 100% | 100% | | 100% | 100% |

Table 2.22 (continued; rows follow the ascending order of % of UP to TP)

| Cumulative % of 3 (Xi) | Cumulative % of 4 (Yi) | Xi·Yi+1 | Yi·Xi+1 |
|---|---|---|---|
| 16.41 | 9.03 | 333.78 | 305.66 |
| 33.85 | 20.34 | 1271.74 | 1161.21 |
| 57.09 | 37.57 | 2565.62 | 2509.68 |
| 66.8 | 44.94 | 5181.01 | 4012.69 |
| 89.29 | 77.56 | 8929 | 7756 |
| 100 | 100 | | |
| ∑ | | 18,281.15 | 15,745.24 |

Source Census of India (2011)

Table 2.23 Inequality in the distribution of income of people of Sweden, USA and India

| Decile | % income: Sweden | % income: USA | % income: India | Cumulative % income: Sweden | Cumulative % income: USA | Cumulative % income: India |
|---|---|---|---|---|---|---|
| 1 | 3.3 | 1.9 | 0.6 | 3.3 | 1.9 | 0.6 |
| 2 | 6.2 | 3.8 | 1.2 | 9.5 | 5.7 | 1.8 |
| 3 | 7.4 | 5.5 | 3.0 | 16.9 | 11.2 | 4.8 |
| 4 | 8.4 | 6.8 | 5.1 | 25.3 | 18.0 | 9.9 |
| 5 | 9.3 | 8.2 | 6.2 | 34.6 | 26.2 | 16.1 |
| 6 | 10.2 | 9.5 | 6.9 | 44.8 | 35.7 | 23 |
| 7 | 11.1 | 11.2 | 7.2 | 55.9 | 46.9 | 30.2 |
| 8 | 12.3 | 13.3 | 9.4 | 68.2 | 60.2 | 39.6 |
| 9 | 13.7 | 16.1 | 25.4 | 81.9 | 76.3 | 65.0 |
| 10 | 18.1 | 23.7 | 35.0 | 100 | 100 | 100 |

Sources Statistics Sweden, online database (2014); U.S. Census Bureau, Historical Income Tables (2016); Credit Suisse’s Global Wealth Databook (2014)

Fig. 2.34 Lorenz curve showing the inequality in the distribution of number and area of land holdings

Fig. 2.35 Lorenz curve showing the inequality in the distribution of total and urban population

Gini Coefficient (G)

The inequality in the distribution of any phenomenon is numerically measured by an index known as the ‘Gini coefficient’, developed by the Italian statistician Corrado Gini in the year 1912.
The Gini coefficient is the ratio of the area between the Lorenz curve and the equal distribution line to the area of the triangle formed by the x-axis, y-axis and the equal distribution line. Graphically, the Gini coefficient is a ratio of two areas: the area representing the summation of all vertical differences between the Lorenz curve and the equal distribution line (‘A’ in Figs. 2.34 and 2.35) divided by the area between the line of perfect equality and the lines of perfect inequality (‘A + B’ in Figs. 2.34 and 2.35). Therefore, the Gini coefficient can be defined as $A/(A + B)$ (shown in Figs. 2.34 and 2.35). In the case of uniform distribution, the Lorenz curve falls on the line of equal distribution. Then the area between the curve and the line of equality is zero (A = 0) and the value of the Gini coefficient becomes 0, which means perfect equality. For the distribution with maximum concentration or inequality, the curve coincides with the ‘x’ and ‘y’ axes. Then the area of ‘B’ is 0 (B = 0), i.e. the area between the Lorenz curve and the equal distribution line becomes equal to the area of the triangle formed by the x-axis, y-axis and the equal distribution line, and the value of the Gini coefficient becomes unity or 1, which means perfect inequality. Thus the value of the Gini coefficient varies on a scale between zero (0) and unity (1 or 100%) [0 ≤ G ≤ 1].

Fig. 2.36 Lorenz curve showing the inequality of income distribution of people in Sweden, USA and India. Sources Statistics Sweden, online database (2014); U.S. Census Bureau, Historical Income Tables (2016); Credit Suisse’s Global Wealth Databook (2014)

The value of the Gini coefficient (G) can be numerically calculated using the following formula:

$$G = \frac{1}{100 \times 100}\left|\sum (X_i \cdot Y_{i+1}) - \sum (Y_i \cdot X_{i+1})\right| \tag{2.22}$$

where $X_i$ and $Y_i$ are the cumulative percentage distributions of the two attributes.
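Eq. (2.22) is a sum of cross-products of consecutive cumulative percentages, so it is easy to check by machine. The sketch below is a minimal illustration, assuming the cumulative percentages are supplied in plotting order and end at 100; the function name `gini_from_cumulative` is hypothetical, and the input lists are the (Xi) and (Yi) columns of Tables 2.22 and 2.21.

```python
def gini_from_cumulative(x_cum, y_cum):
    """Eq. (2.22): G = |sum(Xi * Yi+1) - sum(Yi * Xi+1)| / (100 * 100).

    x_cum, y_cum: cumulative percentages in the same order, ending at 100.
    """
    pairs = range(len(x_cum) - 1)
    sum_xy = sum(x_cum[i] * y_cum[i + 1] for i in pairs)
    sum_yx = sum(y_cum[i] * x_cum[i + 1] for i in pairs)
    return abs(sum_xy - sum_yx) / (100 * 100)


# Cumulative percentages from Table 2.22 (total vs urban population)
x_2_22 = [16.41, 33.85, 57.09, 66.8, 89.29, 100]
y_2_22 = [9.03, 20.34, 37.57, 44.94, 77.56, 100]

# Cumulative percentages from Table 2.21 (number vs area of landholdings)
x_2_21 = [26.17, 46.73, 63.55, 78.5, 88.78, 96.26, 99.06, 100]
y_2_21 = [6.64, 17.43, 27.66, 45.64, 66.39, 86.44, 96.12, 100]

g_2_22 = gini_from_cumulative(x_2_22, y_2_22)
g_2_21 = gini_from_cumulative(x_2_21, y_2_21)
print(f"Table 2.22: G = {g_2_22:.4f}")
print(f"Table 2.21: G = {g_2_21:.4f}")
```

For Table 2.22 this reproduces the worksheet value of 0.2536. For Table 2.21 it returns about 0.449 rather than 0.46; the difference appears to come from the worksheet entry for 86.44 × 99.06, which evaluates to 8562.75 rather than the printed 8462.75, so the column total would be 32,206.65.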
For the data shown in Table 2.21,

$$G = \frac{1}{100 \times 100}\,|36693.37 - 32106.65| = \frac{4586.72}{10000} = 0.4587 \approx 0.46$$

The value of the Gini coefficient (G) can also be worked out graphically using the following technique:

In Fig. 2.34, the length of the straight line ‘xr’ is 5.9 cm and this length is equivalent to an inequality of unity (1). The line ‘xr’ cuts the Lorenz curve at the point ‘p’ and the distance ‘pr’ is 2.7 cm. Hence, the degree of inequality (G) can be calculated as pr/xr = 2.7 cm/5.9 cm, i.e. G = 0.4576 ≈ 0.46.

Uses of the Lorenz Curve and Gini Coefficient

• The Lorenz curve is the simplest representation and the Gini coefficient is the easiest measurement of inequality, and both can be interpreted easily.
• It is the most effective measure for comparing the differences between two or more data distributions.
• It can be easily applied to understand the change of distribution of any phenomenon within a country or region over a period of time, i.e. whether the inequality in the distribution is increasing or decreasing.
• It displays the distribution of wealth or income of a country or region among the population with the help of a graph.
• It can be effectively used while introducing specific measures for the development of the weaker sections in the economy.
• It can be applied to assess the effectiveness of a government policy implemented to help the redistribution of income.

Problems of Using Lorenz Curve and Gini Coefficient

• Data restrictions, i.e. negative values cannot be represented.
• It might not always be rigorously accurate for a finite population.
• When two Lorenz curves intersect, it is difficult to ascertain which distribution illustrated by the curves represents more inequality.
• The Gini coefficient, as an index of concentration, should not be compared with the degree of concentration of a phenomenon or activity in a region, which is what the ‘location quotient’ is concerned with.
• Decision-makers and researchers are mostly interested in analysing inequality by a single number.

For the data given in Table 2.22,

\[ G = \frac{1}{100 \times 100} \left| 18281.15 - 15745.24 \right| = \frac{1}{10000} \left| 2535.91 \right| = \frac{2535.91}{10000} = 0.2536 \]

G = 0.25 (rounded off)

In Fig. 2.35, the length of the straight line ‘xr’ is 5.6 cm and this length is equivalent to the inequality of unity (1). The line ‘xr’ cuts the Lorenz curve at the point ‘p’ and the distance ‘pr’ is 1.4 cm. Hence, the degree of inequality (G) can be calculated as pr/xr = 1.4 cm / 5.6 cm, or G = 0.25.

(a) Inequality in the distribution of income of population in Sweden
In Fig. 2.36, the length of the straight line ‘px’ is 6.0 cm and this length is equivalent to the inequality of unity (1). The line ‘px’ cuts the Lorenz curve of Sweden at the point ‘q’ and the distance ‘pq’ is 1.0 cm. Hence, the degree of inequality (G) can be calculated as pq/px = 1.0 cm / 6.0 cm = 0.17.

(b) Inequality in the distribution of income of population in the USA
The line ‘px’ cuts the Lorenz curve of the USA at the point ‘r’ and the distance ‘pr’ is 1.5 cm. Hence, the degree of inequality (G) can be calculated as pr/px = 1.5 cm / 6.0 cm = 0.25.

(c) Inequality in the distribution of income of population in India
The line ‘px’ cuts the Lorenz curve of India at the point ‘s’ and the distance ‘ps’ is 2.5 cm. Hence, the degree of inequality (G) can be calculated as ps/px = 2.5 cm / 6.0 cm = 0.42.

Therefore, it is clear that the distribution of income of the people is more unequal in India (G = 0.42) compared to Sweden (G = 0.17) and the USA (G = 0.25).
In India, the bottom 10% of the people possess only 0.6% of the total income of the country, whereas the corresponding shares are 3.3% and 1.9% of total national income in Sweden and the USA, respectively. The richest 10% of people in India enjoy 35% of the country’s national income, which is almost double the share of the top 10% of people in Sweden (18.1%). In the USA, the top 10% of people have 23.7% of the total national income. In India, 50% of the people have only 16.1% of the total income of the country, whereas the corresponding shares are 34.6% and 26.2% of total national income in Sweden and the USA, respectively.

2.5.4.8 Dispersion Graph

In any set of data the actual values differ from each other and also from the mean or average value. The measurement and analysis of this spread-out character of the data set is called ‘dispersion’. In other words, dispersion indicates the degree of heterogeneity among the values in a data set. The greater the heterogeneity among the values, the greater the degree of dispersion. In statistics, dispersion is as characteristic a property of data as similarity. It can be measured using two methods: (1) by means of the distances between specific observed values and (2) by means of the average deviations of individual observed values about the central value (Pal 1998).

When dispersion is measured in terms of the difference between the highest and the lowest values of the observations in a data set, it is called ‘range’. The range is used when the spread is expressed as the distance between individual values, and when those values are arranged graphically in terms of their magnitude, a ‘dispersion graph’ is obtained. The total spread of the data within the range can be obtained from this graph. Symbolically,

\[ \text{Coefficient of range} = \frac{L - S}{L + S} \] (2.23)

where L and S are the largest and smallest values, respectively.
Dispersion graphs are normally used to show the most important pattern in the distribution of the data set. The graph displays each value plotted as an individual point on a vertical scale. It shows the range of data and the distribution of each individual value within that range. A rainfall dispersion graph is one in which the seasonal or annual amount of rainfall of each year is represented by a point placed against a vertical scale, so that the span of dry years, normal years and wet years over a period of time can be observed at a glance.

Methods of Construction of Rainfall Dispersion Graph
• At first the fundamental values like the median (Q2), upper quartile (Q3) and lower quartile (Q1) etc. of the given data set are obtained using a suitable formula.
• To obtain these values all the observations are arranged in ascending order. To obtain the value of the lower quartile (25% of the observations are smaller and 75% are larger than this value), counting should start from the lower end, and in the case of the upper quartile (75% of the observations are smaller and 25% are larger than this value) counting should start from the upper end of the data set. In our given example with annual rainfall data for 30 years (n = 30), the lower quartile (Q1) will lie at the (n + 1)/4 = 7.75th position of the series, and the upper quartile (Q3) will lie at the 3(n + 1)/4 = 23.25th position (Table 2.24). The (n + 1)/2 = 15.5th value reckoned from the bottom or top of the graph indicates the median or Q2, which divides the entire data set into two equal halves (Fig. 2.37).
• A suitable vertical scale is selected and then each value of the data set (annual rainfall) is plotted as an individual point on that vertical scale for the whole of the period under consideration (Fig. 2.37).
• The central value (the median value, Q2) is selected and displayed on the graph. Similarly, the values of Q1 and Q3 are also displayed on the graph to understand the dispersion or variability among the observations within the graph.

Figure 2.37 illustrates the 30 years of annual rainfall distribution of Bankura district in a dispersion graph.

Table 2.24 Calculations for rainfall dispersion graph (Annual rainfall of Bankura district, year 1976–2015)
Rainfall in ascending order (mm) | Position | Rainfall in ascending order (mm) | Position
1040 | 1 | 1620 | 16
1062 | 2 | 1645 | 17
1092 | 3 | 1678 | 18
1092 | 4 | 1690 | 19
1129 | 5 | 1765 | 20
1156 | 6 | 1780 | 21
1280 | 7 | 1828 | 22
1288 | 8 | 1856 | 23
1290 | 9 | 1856 | 24
1290 | 10 | 1876 | 25
1452 | 11 | 1959 | 26
1467 | 12 | 1975 | 27
1560 | 13 | 1993 | 28
1569 | 14 | 2014 | 29
1595 | 15 | 2128 | 30
Lower quartile (Q1): 1286 mm (7.75th position)
Median (Q2): 1607.5 mm (15.5th position)
Upper quartile (Q3): 1856 mm (23.25th position)
Co-efficient of Quartile Deviation: 0.1814

Fig. 2.37 Rainfall dispersion graph of Bankura district (year 1976–2015)

Advantages
• It is easy to understand visually.
• It represents the spread of data from the mean and conveys much more information than other graphs drawn on mean values alone.
• The values of the range, mean, median, mode, lower quartile, upper quartile and inter-quartile range can be found from it.
• It can show the anomalies in the data set.
• It makes it possible to compare the variability of two or more sets of data.

Disadvantages
• It is better suited to large amounts of data.
• Sometimes the important features of the rainfall distribution may not be shown.
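The quartile values listed in Table 2.24 can be checked with a short Python sketch. It assumes that a fractional position such as 7.75 is resolved by linear interpolation between the two neighbouring observations, and that the coefficient of quartile deviation is (Q3 − Q1)/(Q3 + Q1); both assumptions reproduce the tabulated values. The helper name value_at_position is hypothetical.

```python
def value_at_position(sorted_values, position):
    """Value at a 1-based, possibly fractional, position (linear interpolation)."""
    lower = int(position)            # whole part of the position
    fraction = position - lower      # fractional part of the position
    if lower < 1:
        return float(sorted_values[0])
    if lower >= len(sorted_values):
        return float(sorted_values[-1])
    below = sorted_values[lower - 1]
    above = sorted_values[lower]
    return below + fraction * (above - below)


# Annual rainfall (mm) from Table 2.24, already in ascending order
rainfall = [1040, 1062, 1092, 1092, 1129, 1156, 1280, 1288, 1290, 1290,
            1452, 1467, 1560, 1569, 1595, 1620, 1645, 1678, 1690, 1765,
            1780, 1828, 1856, 1856, 1876, 1959, 1975, 1993, 2014, 2128]

n = len(rainfall)
q1 = value_at_position(rainfall, (n + 1) / 4)        # 7.75th position
q2 = value_at_position(rainfall, (n + 1) / 2)        # 15.5th position
q3 = value_at_position(rainfall, 3 * (n + 1) / 4)    # 23.25th position

coef_quartile_dev = (q3 - q1) / (q3 + q1)
largest, smallest = rainfall[-1], rainfall[0]
coef_range = (largest - smallest) / (largest + smallest)   # Eq. 2.23

print(q1, q2, q3)                    # 1286.0 1607.5 1856.0
print(round(coef_quartile_dev, 4))   # 0.1814
print(round(coef_range, 4))          # 0.3434
```

The same sketch also evaluates the coefficient of range of Eq. 2.23 for this series, with L = 2128 mm and S = 1040 mm.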
2.5.4.9 Rank-Size Graph

The rank-size rule or rank-size relationship is an empirical and practical regularity of city size distribution observed in the urban systems in many countries in the world. It is an important method for analysing the total settlement networks in a region or country. Hence, it is a technique for understanding the national settlement system and it facilitates the depiction and interpretation of the relationship between the population size and rank of the urban places. The rank-size rule was first identified by Auerbach in 1913 but postulated and popularized by G.K. Zipf in 1949 in his book ‘Human behaviour and the principle of least effort’. After that, many geographers have studied the size distribution of settlements and described the relationship between the number and size of the settlements in geographical form.

In its general form, the rule states that, if all urban areas or cities in a country or region are ranked according to the population size with the largest city having the first rank, then the population of any urban area or city multiplied by its rank will equal the population of the first ranking city (largest city). In other words, the population of a city or urban area (P_r) of rank r can be calculated by dividing the population of the first ranking city (P_1) by its rank. Symbolically, it can be written as

\[ P_r = \frac{P_1}{r} \] (2.24)

where P_r is the population of the ‘r’ ranking city; P_1 is the population of the first ranking city and r is the rank of the city.

Accordingly, the second-ranking city of a country or region has half of the population of the largest city; the third-ranking city has one-third of the population of the largest city and so on down the scale (Table 2.25). The relationship can be represented graphically by plotting the population of the city with respect to its rank.
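A minimal Python sketch of Eq. 2.24 is given below. The function name zipf_expected is hypothetical, and the input is simply the 2011 population of the first-ranking city in Table 2.25. Note that Table 2.25 itself scales 1/r by the ratio of the total population to the sum of 1/r, so its expected values differ from this basic form of the rule.

```python
def zipf_expected(p1, n_cities):
    """Expected populations P_r = P_1 / r for ranks 1..n_cities (Eq. 2.24)."""
    return [p1 / r for r in range(1, n_cities + 1)]


# Illustrative input: population of the first-ranking city in Table 2.25
expected = zipf_expected(18_394_912, 5)
for rank, population in enumerate(expected, start=1):
    print(rank, round(population))
```

Under this basic rule the second-ranking city is expected to have 9,197,456 people, i.e. exactly half of the first.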
If rank (on the ‘x’-axis) and population (on the ‘y’-axis) are plotted using an arithmetic scale then a curve (inverted J-shape) results (Fig. 2.38). But plotting the population against the rank on logarithmic scales will produce a straight line (Fig. 2.39). When logarithmic scales are considered along both the axes then the equation can be rewritten as

\[ \log P_r = \log \left( \frac{P_1}{r} \right) \] (2.25)

\[ \log P_r = \log P_1 - \log r \] (2.26)

Rank-Size Graph According to Zipf (1949)

G.K. Zipf used the method shown in Table 2.25 to compute the expected population of different cities or urban areas in a country or region.

Table 2.25 Rank-size relationship of Indian cities (according to G.K. Zipf method)
City | Total (actual) population (2011 census) | Rank (r) | 1/r | Expected or estimated total population, [∑ Total population / ∑ (1/r)] × (1/r)
Mumbai | 18,394,912 | 1 | 1 | 31,898,746
Delhi | 16,349,831 | 2 | 0.5 | 15,949,373
Kolkata | 14,035,959 | 3 | 0.33 | 10,526,586
Chennai | 8,653,521 | 4 | 0.25 | 7,974,687
Bangalore | 8,520,435 | 5 | 0.20 | 6,379,749
Hyderabad | 7,674,689 | 6 | 0.17 | 5,422,787
Ahmedabad | 6,361,084 | 7 | 0.14 | 4,465,824
Pune | 5,057,709 | 8 | 0.125 | 3,987,343
Surat | 4,591,246 | 9 | 0.11 | 3,508,862
Jaipur | 3,046,163 | 10 | 0.10 | 3,189,875
Kanpur | 2,920,496 | 11 | 0.09 | 2,870,887
Lucknow | 2,902,920 | 12 | 0.08 | 2,551,900
Nagpur | 2,497,870 | 13 | 0.077 | 2,456,203
Ghaziabad | 2,375,820 | 14 | 0.07 | 2,232,912
Indore | 2,170,295 | 15 | 0.067 | 2,137,216
Total | 105,552,950 | | ∑ (1/r) = 3.309 |

Fig. 2.38 Rank-size graph according to G.K. Zipf (arithmetic scale)

Rank-Size Graph According to Pareto

According to this rule, the relation between the population of a town or city and its rank can be expressed as follows (Pareto’s distribution):

\[ P_r = K \cdot r^{-b} \] (2.27)

where P_r is the population of the ‘r’ ranking city; K and b are constants.

The above equation gets transformed into the following linear form after taking the logarithm on both sides:

\[ \log P_r = \log (K \cdot r^{-b}) \] (2.28)

\[ \log P_r = \log K - b \log r \] (2.29)

This equation can be equated with the regression equation:

\[ Y = a - bX \] (2.30)

where Y = log P_r; X = log r; a = log K.

\[ b = \frac{\sum XY - \frac{\sum X \sum Y}{n}}{\sum X^2 - \frac{(\sum X)^2}{n}} \] (2.31)

\[ b = \frac{80.1987 - \frac{12.11649 \times 101.0695}{15}}{11.40195 - \frac{(12.11649)^2}{15}} = \frac{80.1987 - \frac{1224.607586055}{15}}{11.40195 - \frac{146.8093299201}{15}} \]

\[ b = \frac{80.1987 - 81.640505737}{11.40195 - 9.78728866134} = \frac{-1.441805737}{1.61466133866} = -0.8929462188 \]

\[ a = \bar{Y} - b\bar{X} \] (2.32)

\[ a = \frac{101.0695}{15} - (-0.8929462188) \times \frac{12.11649}{15} = 6.73796666667 - (-0.8929462188) \times 0.807766 \]

\[ a = 6.73796666667 + 0.72129159538 = 7.45925826205 \]

In Eqs. 2.31 and 2.32 the symbol b stands for the least-squares slope of Y on X, which is negative here; its magnitude is the exponent b of Eqs. 2.27 and 2.30.

Thus, log P_r = 7.45925826205 − 0.8929462188 log r. Here, a = log K = 7.45925826205 and b = 0.8929462188. So, log K = 7.45925826205, where K = antilog of 7.45925826205. Hence, K = 28,791,100. So, the original equation (Eq. 2.27) can be written in the following form:

\[ P_r = 28791100 \cdot r^{-0.8929462188} \] (2.33)

Table 2.26 Rank-size relationship of Indian cities (according to Pareto method)
City | Total (actual) population (P_r) | Rank (r) | X = log r | X² | Y = log P_r | XY
Mumbai | 18,394,912 | 1 | 0 | 0 | 7.2646977 | 0
Delhi | 16,349,831 | 2 | 0.301029 | 0.090618 | 7.2135132 | 2.1714476
Kolkata | 14,035,959 | 3 | 0.477121 | 0.227644 | 7.1472420 | 3.4100992
Chennai | 8,653,521 | 4 | 0.602059 | 0.362475 | 6.9371928 | 4.1765993
Bangalore | 8,520,435 | 5 | 0.698970 | 0.488559 | 6.9304617 | 4.8441848
Hyderabad | 7,674,689 | 6 | 0.778151 | 0.605518 | 6.8850607 | 5.3576168
Ahmedabad | 6,361,084 | 7 | 0.845098 | 0.714190 | 6.8035311 | 5.7496505
Pune | 5,057,709 | 8 | 0.903089 | 0.815569 | 6.7039538 | 6.0542669
Surat | 4,591,246 | 9 | 0.954242 | 0.910577 | 6.6619305 | 6.3570938
Jaipur | 3,046,163 | 10 | 1 | 1 | 6.4837531 | 6.4837531
Kanpur | 2,920,496 | 11 | 1.041392 | 1.084497 | 6.4654566 | 6.7330747
Lucknow | 2,902,920 | 12 | 1.079181 | 1.164631 | 6.4628350 | 6.9745687
Nagpur | 2,497,870 | 13 | 1.113943 | 1.240869 | 6.3975698 | 7.1265280
Ghaziabad | 2,375,820 | 14 | 1.146128 | 1.313609 | 6.3758135 | 7.3074983
Indore | 2,170,295 | 15 | 1.176091 | 1.383190 | 6.3365187 | 7.4523226
Total | ∑ P_r = 105,552,950 | ∑ r = 120 | ∑ X = 12.11649 | ∑ X² = 11.40195 | ∑ Y = 101.0695 | ∑ XY = 80.1987

Fig. 2.39 Rank-size graph according to Pareto (logarithmic scale)

As per the rank-size relationship, by substituting r = 1, 2, 3, 4 etc. in Eq. (2.33), we get the estimated population of cities ranking 1st, 2nd, 3rd, 4th etc. The populations of the top 15 cities in India are estimated based on the fitted rank-size relationship in Eq. (2.33) and the results are given in Tables 2.26 and 2.27. The graphical representation of the result is shown in Fig. 2.39.

It must be noted that the differences between actual and estimated populations (shown in Table 2.27) have been calculated based on the relationship between 15 cities only. These differences may be reduced if more cities are taken into consideration in determining the relationship.
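The least-squares fit of the Pareto form can be sketched in Python as follows. The sketch works from the actual populations of Table 2.26 and computes the logarithms at full precision, so its results agree with the worked values (b ≈ 0.893, log K ≈ 7.459) only up to rounding; the variable names are illustrative.

```python
import math

# Actual populations (2011 census) of the 15 cities in Table 2.26, by rank
populations = [18_394_912, 16_349_831, 14_035_959, 8_653_521, 8_520_435,
               7_674_689, 6_361_084, 5_057_709, 4_591_246, 3_046_163,
               2_920_496, 2_902_920, 2_497_870, 2_375_820, 2_170_295]

n = len(populations)
x = [math.log10(r) for r in range(1, n + 1)]     # X = log r
y = [math.log10(p) for p in populations]         # Y = log P_r

sum_x, sum_y = sum(x), sum(y)
sum_xy = sum(xi * yi for xi, yi in zip(x, y))
sum_x2 = sum(xi * xi for xi in x)

# Least-squares slope of Y on X (negative here) and intercept
slope = (sum_xy - sum_x * sum_y / n) / (sum_x2 - sum_x ** 2 / n)
intercept = sum_y / n - slope * sum_x / n

b = -slope               # exponent b in P_r = K * r**(-b)
k = 10 ** intercept      # K is the antilog of the intercept

# Estimated population of each rank from the fitted relationship
estimated = [k * r ** (-b) for r in range(1, n + 1)]
print(round(b, 3), round(intercept, 3))
```

Substituting r = 1 returns K itself, which is why the estimate for the first-ranking city equals the fitted constant.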
Types of Deviations in Rank-Size Rule

Three main types of deviations in the rank-size rule are as follows (Siddhartha and Mukherjee 2002):

Primary Deviation
The population of the second-largest city is less than half the population of the largest city, i.e. a condition for the development of a primate city (the city having twice or more the population of the next ranking city in the urban hierarchy) emerges (Fig. 2.40). A primate city is developed when a few simple strong forces operate rather than many complex forces. The small size of the country, long colonial history, simple economic and political organization, dual economy etc. are the factors leading to the development of city primacy. Bangkok (Thailand), Lagos (Nigeria), Harare (Zimbabwe) etc. are some examples of the primate city.

Table 2.27 Expected populations and their deviations from actual populations
City | Rank (r) | Actual population | Estimated population | Difference | % Difference
Mumbai | 1 | 18,394,912 | 28,791,100 | 10,396,188 | 36.11
Delhi | 2 | 16,349,831 | 15,504,389 | −845,442 | −5.45
Kolkata | 3 | 14,035,959 | 10,794,800 | −3,241,159 | −30.02
Chennai | 4 | 8,653,521 | 8,349,319 | −304,202 | −3.64
Bangalore | 5 | 8,520,435 | 6,840,937 | −1,679,498 | −24.55
Hyderabad | 6 | 7,674,689 | 5,813,143 | −1,861,546 | −32.02
Ahmedabad | 7 | 6,361,084 | 5,065,603 | −1,295,481 | −25.57
Pune | 8 | 5,057,709 | 4,496,219 | −561,490 | −12.49
Surat | 9 | 4,591,246 | 4,047,352 | −543,894 | −13.44
Jaipur | 10 | 3,046,163 | 3,683,935 | 637,772 | 17.31
Kanpur | 11 | 2,920,496 | 3,383,378 | 462,882 | 13.68
Lucknow | 12 | 2,902,920 | 3,130,454 | 227,534 | 7.27
Nagpur | 13 | 2,497,870 | 2,914,518 | 416,648 | 14.29
Ghaziabad | 14 | 2,375,820 | 2,727,894 | 352,074 | 12.91
Indore | 15 | 2,170,295 | 2,564,909 | 394,614 | 15.38
Total | ∑ r = 120 | ∑ P_r = 105,552,950 | 108,107,950 | 2,555,000 | −30.23

Binary Deviation
Binary deviation emerges when the population of the second-largest city is more than half the population of the largest city.
This situation is observed when a number of cities of almost similar size dominate the upper end of the hierarchy (Fig. 2.40). The high rate of industrialization, presence of more than one national identity, long history of urbanization etc. are the factors responsible for binary deviation in the rank-size relationship. Examples are Madrid and Barcelona in Spain, and Mumbai, Delhi and Kolkata in India.

Stepped Pattern Deviation
In stepped pattern deviation, not one but a number of cities may be observed at every level, each city resembling the others in population size and functioning (Fig. 2.40).

Fig. 2.40 Deviations in rank-size distribution

2.5.4.10 Box Plot (‘Box-And-Whiskers’) Graphs

The concept of the box-and-whiskers graph was first given by John Tukey in 1970. A box plot, also known as a box-and-whisker plot, is an important graphical method to represent the spread and centre of a data set. Measures of spread consist of the inter-quartile range and the range, whereas the measures of the centre include the average or mean and the median (the middle-most value) of a data set. A box-and-whisker plot is a data display that allows many attributes of a distribution to be seen at a glance, i.e. it is a useful means of getting a quick summary of the data set.

Elements of a Box-And-Whisker Plot
A box plot is a convenient method for the graphical depiction of groups of numerical data using the five-number summary: the minimum, the maximum, the median, the lower quartile and the upper quartile (Figs. 2.41 and 2.42).
1. Minimum: It is the lowest value in the data set excluding outliers, if any, and is shown at the far left of the plot, i.e. at the end of the left ‘whisker’.
2. Maximum: It is the largest value in the data set excluding outliers, if any, and is shown at the far right of the right ‘whisker’.
3. Median (Q2): It is the middle-most value of the data set and is represented as a line at the centre (middle) of the box.
4. First (lower) quartile (Q1): It is the middle value between the smallest number (not always the minimum) and the median (Q2) of the data set and is shown at the far left of the box, i.e. at the far right of the left ‘whisker’.
5. Third (upper) quartile (Q3): It is the middle value between the largest number (not always the maximum) and the median (Q2) of the data set and is shown at the far right of the box, i.e. at the far left of the right ‘whisker’.
6. Inter-quartile range (IQR): It is the distance between the lower and upper quartiles.

\[ IQR = Q_3 - Q_1 = q_n(0.75) - q_n(0.25) \] (2.34)

Fig. 2.41 Box-and-whisker graph without outliers
Fig. 2.42 Box-and-whisker graph with outliers

Methods of Construction
In constructing this graph, at first we draw an equal interval scale and, using this scale, a rectangular box is drawn with one end at the lower quartile (Q1) and the other end at the upper quartile (Q3). Then we draw a vertical line at the median value, i.e. at the second quartile (Q2). A distance of 1.5 times the IQR is measured out to the right of the upper quartile and a horizontal line is drawn up to the largest observed value in the given data set that falls within this distance. In the same way, a distance of 1.5 times the IQR is measured out to the left of the lower quartile and a horizontal line is drawn up to the smallest observed value in the data set that falls within this distance. These two horizontal lines or segments are called the ‘whiskers’ (Figs. 2.41 and 2.42). All other observed values are plotted as outliers.

The spacing between different divisions of the box specifies the amount of dispersion or spread (degree of heterogeneity) and skewness (degree of symmetry or asymmetry) in the data set, and gives an idea about outliers.
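The construction rule can be sketched in Python as follows. The sketch assumes that the quartiles are taken as the medians of the lower and upper halves of the ordered data, which matches the way the worked examples of this section obtain Q1 and Q3; the function name box_plot_summary is hypothetical. The data are the 24 monthly rainfall values (mm) of the example with outliers.

```python
from statistics import median


def box_plot_summary(values):
    """Quartiles, whisker ends and outliers under the 1.5 x IQR rule."""
    data = sorted(values)
    n = len(data)
    lower_half = data[: n // 2]            # observations below the median
    upper_half = data[(n + 1) // 2:]       # observations above the median
    q1, q2, q3 = median(lower_half), median(data), median(upper_half)
    iqr = q3 - q1                          # Eq. 2.34
    low_fence, high_fence = q1 - 1.5 * iqr, q3 + 1.5 * iqr
    # Whiskers end at the extreme observations that lie within the fences
    inside = [v for v in data if low_fence <= v <= high_fence]
    outliers = [v for v in data if v < low_fence or v > high_fence]
    return {"q1": q1, "median": q2, "q3": q3, "iqr": iqr,
            "fences": (low_fence, high_fence),
            "whiskers": (inside[0], inside[-1]),
            "outliers": outliers}


rainfall = [47, 52, 52, 53, 58, 61, 61, 62, 62, 63, 64, 65,
            65, 65, 66, 67, 68, 70, 70, 71, 72, 73, 74, 84]
summary = box_plot_summary(rainfall)
print(summary["whiskers"], summary["outliers"])  # (52, 74) [47, 84]
```

Other quartile conventions (for example, interpolating at the (n + 1)/4th position) can give slightly different box edges for the same data, so the convention used should be stated alongside the plot.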
Example Without Outliers
The monthly rainfall of 24 months was measured in mm and the values are given below:
52, 52, 52, 53, 58, 61, 61, 62, 62, 63, 64, 65, 65, 65, 66, 67, 68, 70, 70, 71, 72, 73, 74 and 76.

A box-and-whisker plot can be constructed by calculating the five-number summary: minimum, maximum, median, lower quartile and upper quartile.

The minimum is the smallest number in the given rainfall data set. Here, the minimum monthly rainfall is 52 mm.

The maximum is the largest number in the given rainfall data set. Here, the maximum monthly rainfall is 76 mm.

The median is the middle value of the ordered rainfall data set. This means that 50% of the rainfall values are less than the median and 50% of the rainfall values are greater than the median rainfall. So, the median rainfall of the given data set is 65 mm.

The lower quartile is the value below which 25% of the values lie and above which 75% of the values lie. It can be easily calculated by finding the middle value of the observations lying between the minimum value and the median value. In this data set, the value of the lower quartile (middle value between 52 and 65 mm) is 61 mm.

The upper quartile is the value below which 75% of the values lie and above which 25% of the values lie. It can be easily calculated by finding the middle value of the observations lying between the median value and the maximum value. In this data set, the value of the upper quartile (middle value between 65 and 76 mm) is 70 mm.

The inter-quartile range (IQR) can be calculated using Eq. 2.34:
IQR = Q3 − Q1 = 70 mm − 61 mm = 9 mm
Hence, 1.5 IQR = 1.5 × 9 mm = 13.5 mm
1.5 IQR after (above) the upper quartile is Q3 + 1.5 IQR = 70 mm + 13.5 mm = 83.5 mm
1.5 IQR before (below) the lower quartile is Q1 − 1.5 IQR = 61 mm − 13.5 mm = 47.5 mm.
Here, the maximum rainfall in the data set is 76 mm and 1.5 IQR after (above) the upper quartile is 83.5 mm, which indicates that the largest data set value is lower than 1.5 IQR after (above) the upper quartile. Therefore, the upper whisker will be drawn at the maximum rainfall value, i.e. 76 mm.

In the same way, the minimum rainfall in the data set is 52 mm and 1.5 IQR before (below) the lower quartile is 47.5 mm, which indicates that the smallest data set value is greater than 1.5 IQR before (below) the lower quartile. Therefore, the lower whisker will be drawn at the minimum rainfall value, i.e. 52 mm (Fig. 2.41).

Example with Outliers
The above example is without outliers. In this example, outliers are incorporated by changing the first and last values of rainfall (in mm):
47, 52, 52, 53, 58, 61, 61, 62, 62, 63, 64, 65, 65, 65, 66, 67, 68, 70, 70, 71, 72, 73, 74 and 84.

As all the values except the first and last values remain unchanged, the median, lower quartile and upper quartile remain the same.

In this data set, the maximum rainfall value is 84 mm and 1.5 IQR after (above) the upper quartile is 83.5 mm, which indicates that the maximum rainfall is larger than 83.5 mm. So, the maximum rainfall value (84 mm) is an outlier. Therefore, the upper whisker will be drawn at the greatest rainfall value smaller than 83.5 mm, which is 74 mm.

In the same way, the minimum rainfall value is 47 mm and 1.5 IQR before (below) the lower quartile is 47.5 mm, which indicates that the minimum rainfall is smaller than 47.5 mm. So, the minimum rainfall value (47 mm) is an outlier. Therefore, the lower whisker will be drawn at the smallest rainfall value larger than 47.5 mm, which is 52 mm (Fig. 2.42).

2.5.4.11 Hypsometric Curve or Graph

The hypsometric (also called hypsographic) curve or graph is an important form of cumulative frequency curve or Ogive.
It is obtained by plotting the height of the contour with respect to the corresponding proportions of a specified unit area of the earth’s surface (say a drainage basin) (Pal 1998). Hypsometry, first described by Strahler (1952), involves the measurement and analysis of the relationship between height and basin area to understand the degree of dissection and stage of the cycle of erosion.

The basic data required for the study of the area–height relationship are the areas between successive contours and their respective heights. The area may be measured with the help of a planimeter or may be estimated by the intercept method. The height is obtained from the contour map. The area–height graph indicates actual areas between two successive contours, and therefore the horizontal axis represents the area in terms of percentage of total area and the vertical axis represents the height. The hypsometric graph is generally used to show the proportion of the area of the surface at various elevations above or below a datum; thus the values of the area are plotted as ratios of the total area of the basin against the corresponding heights of the contours, and hence the area is represented by cumulative proportion or percentage.

A hypsometric curve is basically a graph representing the proportion of land area that exists at different heights by plotting relative area with respect to relative height. On our earth, the heights can take on either positive (above sea level) or negative (below sea level) values and are bi-modal due to the contrast between the continents and oceans. Hypsometry of the earth reveals that the earth has two peaks in height, one for the continents and the other for the ocean floors (Fig. 2.43).

Fig. 2.43 Hypsometric curve for the whole earth

From Table 2.28, it is clear that the total area of the whole basin (A) is 4830 km² and the maximum height of the basin (H) is 575 m. The area–height relationship of the basin (Fig. 2.44) can be expressed dimensionlessly by computing the relative area (the ratio between the individual area between successive contours, a_i, and the whole area of the basin, A) and the relative height (the ratio of the mid-value of the contour height, h_i, to the maximum height of the drainage basin, H). These ratios are computed in Table 2.28 and represented on graphs as shown in Figs. 2.45 and 2.46. Although the hypsometric curve representing the relation between the proportions of area (shown on the x-axis or abscissa) and height (shown on the y-axis or ordinate) essentially passes through X = 0, Y = 1 and X = 1, Y = 0, its location on the graph is a function of the stage of erosion of the basin concerned.

Table 2.28 Calculations for area–height graph and hypsometric curve in a sample drainage basin (Fig. 2.44)
Class intervals for height (m) | Mid value h_i (m) | Area between contours a_i (sq. km) | h_i/H | a_i/A | Cumulative a_i/A (less than) | a_i·h_i
<200 | 175 | 130 (2.69%) | 0.30 | 0.027 | 0.027 | 22,750
200–250 | 225 | 260 (5.38%) | 0.39 | 0.054 | 0.081 | 58,500
250–300 | 275 | 680 (14.08%) | 0.48 | 0.141 | 0.222 | 187,000
300–350 | 325 | 1230 (25.47%) | 0.56 | 0.255 | 0.477 | 399,750
350–400 | 375 | 1080 (22.36%) | 0.65 | 0.224 | 0.701 | 405,000
400–450 | 425 | 760 (15.73%) | 0.74 | 0.156 | 0.857 | 323,000
450–500 | 475 | 420 (8.70%) | 0.83 | 0.087 | 0.944 | 199,500
500–550 | 525 | 120 (2.48%) | 0.91 | 0.025 | 0.969 | 63,000
>550 | Say 575 | 150 (3.11%) | 1.00 | 0.031 | 1.00 | 86,250
Total | H = 575 | ∑ a_i = 4830 km² (= A) | | | | ∑ a_i·h_i = 1,744,750

Hypsometric Integral (HI)
The hypsometric integral (HI) is the ratio of the volume, or percentage of the total volume, of the basin area below the curve (Fig. 2.46), and thus it indicates the volume of the basin area unconsumed by the dynamic wheels of erosion, whereas the erosion integral (EI) is the proportionate area above the curve, and thus it reveals the volume of basin area which has already been consumed by the erosional processes.
Thus the hypsometric integral is the ratio between the area of the surface below the hypsometric curve (it is between 0 and 1) and the area of the whole square (here it is 1). Theoretically, the value of the hypsometric integral ranges between 0 and 1.

Fig. 2.44 Sample drainage basin showing height and area
Fig. 2.45 Area–height relationship of the given drainage basin
Fig. 2.46 Hypsometric curve of the given drainage basin

Though the value of the hypsometric integral can be calculated using different techniques, the following techniques are very popular and widely accepted.

(1) Elevation–relief ratio (E) relationship method: When the spot heights of a number of places are known, the value of the hypsometric integral is calculated using the following equation:

\[ E \approx HI = \frac{El_{mean} - El_{min}}{El_{max} - El_{min}} \] (2.35)

where E is the elevation–relief ratio equivalent to the hypsometric integral (HI); El_mean is the weighted mean elevation of the entire drainage basin; El_max and El_min are the maximum and minimum elevations of the drainage basin, respectively.

The weighted mean height of the drainage basin is calculated using the following formula:

\[ h_c = \frac{\sum a_i h_i}{\sum a_i} \] (2.36)

where h_c is the mean height of the drainage basin.

In the given example (Table 2.28 and Fig. 2.44), ∑ a_i h_i = 1,744,750 and ∑ a_i = 4830. So, the mean height of the drainage basin is h_c = 1744750/4830 = 361.23 m. The values of El_max and El_min are 575 m and 175 m, respectively, so

\[ E \approx HI = \frac{361.23 - 175}{575 - 175} = \frac{186.23}{400} = 0.4656 \]

Hence, the value of the hypsometric integral is 0.4656.

(2) When the co-ordinates of the points defining the hypsometric curve are known (x, y; x = a_i/A and y = h_i/H) (Fig. 2.46), HI is found using the following equation:

\[ HI = \frac{\left| \sum x_i y_{i+1} - \sum y_i x_{i+1} \right|}{2} \] (2.37)

(3) Mathematically, the hypsometric integral can be found from the integral calculus as

\[ \text{Integral } f = \frac{\text{Volume}}{\text{Total height} \times \text{Total area}} \int_{0.0}^{1.0} a \cdot \Delta h \] (2.38)

where a is area and Δh is the range in height h.

Importance of Hypsometric Curve and Hypsometric Integral
The value of the hypsometric integral has been accepted as an important morphometric indicator of the stage of erosion of the basin. According to Strahler (1952), a high integral value exceeding 0.60 indicates the youthful stage in the development of the drainage basin (denudation processes are not keeping pace with the rate of uplift, i.e. much of the rock volume in the basin is still to be eroded); a value between 0.35 and 0.60 indicates the mature or equilibrium stage; and a value less than 0.35 indicates the old erosional surface or monadnock stage of the basin (Fig. 2.47).

Fig. 2.47 Understanding the stages of landform development using hypsometric curve

The hypsometric integral is a dimensionless parameter and hence allows different drainage basins to be compared irrespective of scale. The shape of the hypsometric curve and the value of the hypsometric integral act as important indicators of basin conditions and characteristics. Hypsometric integral values are associated with the degree of disequilibrium in the balance between tectonic forces and the degree of erosion. The hypsometric integral is considered the most useful technique for the study of active tectonics.

A useful aspect of the hypsometric curve is that drainage basins having different sizes can be easily compared with each other, since area and elevation are represented as functions of total area and total elevation, i.e. the hypsometric curve is independent of variations in basin size and relief (Alhamed and Ahmad Ali 2017).
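Returning to the elevation–relief ratio method (Eqs. 2.35 and 2.36), the calculation for the sample basin of Table 2.28 can be sketched in Python. The stage labels follow the Strahler (1952) thresholds quoted above; the function name erosion_stage is hypothetical, and the result agrees with the value worked out in the text up to rounding of the last digit.

```python
# Mid-value of each height class h_i (m) and area between contours a_i (sq. km),
# taken from Table 2.28
mid_heights = [175, 225, 275, 325, 375, 425, 475, 525, 575]
areas = [130, 260, 680, 1230, 1080, 760, 420, 120, 150]

total_area = sum(areas)                                            # sum of a_i
weighted_sum = sum(a * h for a, h in zip(areas, mid_heights))      # sum of a_i * h_i
mean_height = weighted_sum / total_area                            # Eq. 2.36

el_min, el_max = min(mid_heights), max(mid_heights)
hypsometric_integral = (mean_height - el_min) / (el_max - el_min)  # Eq. 2.35


def erosion_stage(hi):
    """Stage of basin development from the thresholds quoted in the text."""
    if hi > 0.60:
        return "youthful"
    if hi >= 0.35:
        return "mature"
    return "old (monadnock)"


print(round(mean_height, 2))                  # 361.23
print(round(hypsometric_integral, 4))         # 0.4656
print(erosion_stage(hypsometric_integral))    # mature
```

With an integral of about 0.47, the sample basin falls in the mature or equilibrium stage.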
It may be noted that the low value of hypsometric integral (below 0.30) is only sustained as long as a few monadnocks give a relatively large difference in height between the highest and the lowest places. But when the monadnocks are eroded, the integral returns to about 0.40–0.60. It may be pointed out that the hypsometric integral is a very delicate morphometric measure and hence it should be used with the greatest care and with field verification, otherwise it may give ambiguous results.

2.5.5 Frequency Distribution Graphs

2.5.5.1 Histogram

The most common and simple form of graphical representation of a grouped frequency distribution is the histogram. It consists of a set of adjoining rectangles drawn on a horizontal baseline, having areas directly proportional to the class frequencies (Das 2009). Generally, the class boundaries are shown along the x-axis (abscissa) and the frequencies are represented along the y-axis (ordinate). As the class boundaries are used to draw the rectangles, the rectangles are contiguous. In constructing a histogram, the fundamental principle is that the area of each rectangle is directly proportional to the class frequency ($f_i$). Hence,

Area of a rectangle $(A) \propto f_i$ (2.39)

or $h_i \times w_i = k \cdot f_i$ [Area of a rectangle (A) = height × width] (2.40)

where $h_i$ is the height of the rectangle for the i th class; $w_i$ is the width of the rectangle for the i th class; k is the constant of proportionality.

or $h_i = \frac{k}{w_i} f_i$ (2.41)

Equation 2.41 applies differently to grouped frequency distributions with equal class size and with unequal class size (Sarkar 2015).

Grouped Frequency Distribution with Equal Class Size

In a grouped frequency distribution where all the classes are of equal size (w), Eq. 2.41 becomes

$h_i = \left( \frac{k}{w} \right) f_i$ (2.42)

or $h_i = k_i \times f_i$; [$k_i = \frac{k}{w}$ = constant because all the classes have equal size] (2.43)

that is, $h_i \propto f_i$ (2.44)

So, in the case of a frequency distribution having the same class size, the height of each rectangle is directly proportional to its class frequency and it is then customary to take the heights numerically equal to the class frequencies (Table 2.29 and Fig. 2.48).

Table 2.29 Grouped frequency distribution with equal class size (average concentration of SPM in the air)
Class boundary    Class mark (x_i)    Class width (w_i)    Frequency (f_i)
170.5–220.5       195.5               50                   9
220.5–270.5       245.5               50                   4
270.5–320.5       295.5               50                   2
320.5–370.5       345.5               50                   6
370.5–420.5       395.5               50                   3
420.5–470.5       445.5               50                   1

Fig. 2.48 Histogram (average concentration of SPM in the air)

Grouped Frequency Distribution with Unequal Class Size

In a grouped frequency distribution where the classes are of different sizes ($w_i$), Eq. 2.41 becomes

$h_i = k \left( \frac{f_i}{w_i} \right)$ (2.45)

or $h_i = k \times fd_i$ [$fd_i$ is the frequency density of the i th class] (2.46)

that is, $h_i \propto fd_i$ (2.47)

Table 2.30 Grouped frequency distribution with unequal class size (monthly income of families)
Class boundary (monthly income in Rs. '00)    Class mark (x_i)    Class width (w_i)    Frequency (f_i) [no. of families]    Frequency density (fd_i) [f_i / w_i]
0–50                                          25                  50                   40                                   0.8
50–120                                        85                  70                   60                                   0.86
120–250                                       185                 130                  45                                   0.35
250–350                                       300                 100                  35                                   0.35
350–600                                       475                 250                  25                                   0.1
600–950                                       775                 350                  20                                   0.057

Fig. 2.49 Histogram (monthly income of families)

So, in the case of classes having unequal width, the rectangles will also be unequal in width and thus their heights must be directly proportional to the frequency densities, not to the class frequencies (Table 2.30 and Fig. 2.49). Therefore, with unequal class sizes, the rectangles of the histogram must be drawn with respect to the frequency densities of the classes.
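The rule of Eq. 2.41 can be checked numerically. The following Python sketch, using the class boundaries and frequencies of Table 2.30, computes the rectangle heights as frequency densities; the function name and the default k = 1 are assumptions made only for illustration.

```python
def histogram_heights(boundaries, frequencies, k=1.0):
    """Rectangle heights h_i = k * f_i / w_i (Eq. 2.41).

    boundaries: consecutive class boundaries (one more than the classes).
    k: constant of proportionality (assumed to be 1 here).
    """
    widths = [upper - lower for lower, upper in zip(boundaries, boundaries[1:])]
    return [k * f / w for f, w in zip(frequencies, widths)]


# Monthly income of families (Table 2.30), income in Rs. '00.
income_bounds = [0, 50, 120, 250, 350, 600, 950]
income_freq = [40, 60, 45, 35, 25, 20]

fd = histogram_heights(income_bounds, income_freq)
print([round(v, 3) for v in fd])

# Each rectangle's area (height x width) recovers the class frequency,
# so the total area equals the total frequency N.
widths = [b - a for a, b in zip(income_bounds, income_bounds[1:])]
total_area = sum(h * w for h, w in zip(fd, widths))
print(round(total_area, 6))
```

When all widths are equal, the same function returns heights that differ from the frequencies only by the constant factor k/w, which is the equal-class-size case of Eq. 2.42.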
Uses of Histogram

1. A series of rectangles or a histogram gives a visual description of the relative size of different groups of a data series. The entire distribution of total frequency into different classes becomes easy to understand at a glance.
2. The outline formed by the tops of the rectangles gives an approximate idea about the nature (average, spread, shape etc.) of the frequency distribution and the frequency curve.
3. It is generally used for the graphical determination of the mode.
4. Numerous geographical, economic and social data can be easily represented by a histogram for better understanding.

2.5.5.2 Difference Between Historigram and Histogram

Historigram and histogram are two important methods of graphical representation of statistical or geographical data. The major differences between these two are as follows:

1. Historigram: Representation of classified and summarized time series data by a line is called a historigram.
   Histogram: A histogram is a set of adjoining rectangles drawn on a horizontal baseline.
2. Historigram: It represents the change of values of different variables with time.
   Histogram: It represents the distribution of frequencies (number of observations) in different measurement classes.
3. Historigram: Time (year, month, day etc.) is shown along the x-axis and the values of the variable are shown along the y-axis.
   Histogram: The class boundaries are shown along the x-axis (abscissa) and the frequencies are shown along the y-axis (ordinate).
4. Historigram: It is used to understand the temporal changes of uni-variate data and to compare the changes of two or more variables with time.
   Histogram: It is used to understand the nature of the frequency distribution, to draw the frequency polygon and frequency curve, to estimate the value of the mode etc.
2.5.5.3 Frequency Polygon

The frequency polygon is a graphical portrayal of a grouped frequency distribution that serves as an alternative to the histogram. It may be regarded as derived from the histogram by joining the mid-points of the tops of successive rectangles by straight lines. In construction, the frequency polygon is obtained by joining with straight lines the consecutive points whose abscissae are the class marks ($x_i$) and whose ordinates are the corresponding class frequencies ($f_i$) (Figs. 2.50, 2.51, 2.52 and 2.53). The two main assumptions for constructing the polygon are:

(i) All the values in a particular class are uniformly distributed within the whole range of the class interval. Thus the class mark ($x_i$) is considered to be the representative of the corresponding class.
(ii) For the same frequency distribution, the area covered by the histogram must be equivalent to the area enclosed within the frequency polygon. For this purpose, the two endpoints of the polygon are joined by straight lines to the abscissa at the mid values (class marks, $x_i$) of the empty classes at each end of the frequency distribution (Figs. 2.50, 2.51, 2.52 and 2.53).

Fig. 2.50 Frequency polygon showing the average concentration of SPM in the air
Fig. 2.51 Frequency polygon showing the monthly income of families

The frequency polygon may be plotted on its own as well as on the histogram (Figs. 2.52 and 2.53). In the case of a distribution with unequal classes, the polygon is drawn by plotting the frequency density ($fd_i$) instead of the simple frequency ($f_i$) against the class mark ($x_i$) (Figs. 2.51 and 2.53).

Fig. 2.52 Histogram with polygon showing the average concentration of SPM (mg/m³) in the air
Fig. 2.53 Histogram with polygon showing the monthly income of families
Fig. 2.54 Frequency polygon of discrete variable (Distribution of landslide occurrences)

Uses of Frequency Polygon

The frequency polygon is especially useful in representing the simple frequency distribution of any discrete variable. For example, the day-wise distribution of the number of landslide occurrences can be represented effectively in a frequency polygon (Fig. 2.54). It gives a better idea about the distribution of observations in different classes and the shape of the frequency curve.

2.5.5.4 Frequency Curve

In a generic sense, the frequency curve is a modified form of the histogram and the frequency polygon. In drawing histograms, it is assumed that the observations (frequencies) are homogeneously distributed throughout the range of values within the class boundaries of any class, but this may not always be true. In practice, a histogram provides only an approximate idea about the nature and pattern of distribution of a limited number of observations in different classes. The widths of the classes of the frequency distribution could be made smaller, but then some of the classes may remain empty (classes without any class frequency) and the actual pattern of the distribution of observations in the population will not be understood.

But if the number of observations is very large, the situation will improve and all the classes are expected to have some frequency, even when the widths of the classes are very small. If the class width becomes smaller and smaller along with an indefinite increase of the total frequency, then the histogram and the frequency polygon tend towards a smooth curve called the frequency curve (Das 2009).

Fig. 2.55 Frequency curve showing the average concentration of SPM (mg/m³) in the air
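Assumption (ii) of the frequency polygon, that the polygon encloses the same area as the histogram, can be verified for the equal-class-size SPM data of Table 2.29. The Python sketch below is illustrative only: the function names are hypothetical and it assumes equal class width, so that the empty end classes lie one class width beyond the first and last class marks.

```python
def polygon_vertices(class_marks, frequencies):
    """Vertices of a frequency polygon closed on the x-axis.

    Assumes equal class width; the polygon is closed at the class marks
    of the empty classes just below and just above the distribution.
    """
    w = class_marks[1] - class_marks[0]
    xs = [class_marks[0] - w] + list(class_marks) + [class_marks[-1] + w]
    ys = [0] + list(frequencies) + [0]
    return list(zip(xs, ys))


def area_under(vertices):
    """Area between the polygon and the x-axis by the trapezoidal rule."""
    return sum((x2 - x1) * (y1 + y2) / 2
               for (x1, y1), (x2, y2) in zip(vertices, vertices[1:]))


# SPM data (Table 2.29): class marks and frequencies, class width 50.
marks = [195.5, 245.5, 295.5, 345.5, 395.5, 445.5]
freq = [9, 4, 2, 6, 3, 1]

verts = polygon_vertices(marks, freq)
hist_area = 50 * sum(freq)          # sum of rectangle areas, w * f_i
print(area_under(verts), hist_area)
```

The two areas agree because each sloping side of the polygon cuts off a triangle from one rectangle and adds an equal triangle beside it. For unequal class widths the check would have to be repeated with frequency densities, and the equality is then only approximate.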
Generally, in the frequency curve, the points obtained by plotting the class frequency ($f_i$) or frequency density ($fd_i$) against the class mark ($x_i$) are joined by a smooth curve instead of a series of straight lines (Figs. 2.55 and 2.56). The frequency curve represents the probability distribution of the variable in the population: the area under the curve between the ordinates erected at two specified points on the abscissa ('x'-axis) indicates the probability that a value of the variable lies between these two points. Like the histogram and the frequency polygon, the frequency curve is also an area graph.

Fig. 2.56 Frequency curve showing the monthly income of families

Based on their shape and characteristics, frequency curves are of four types (Das 2009):

(i) Symmetrical bell-shaped or normal curve (Fig. 2.57a),
(ii) Asymmetrical single-humped curve (Fig. 2.57b),
(iii) J-shaped curve (Fig. 2.57c) and
(iv) U-shaped curve (Fig. 2.57d).

Fig. 2.57 Types of frequency curve

Shape of the Frequency Curve

The shape of the frequency curve is very important as it represents the actual nature of a frequency distribution. It is generally measured in terms of two geometric properties of a frequency curve: (1) symmetry or asymmetry (skewness) and (2) peakedness (kurtosis).

Skewness (Sk)

An important property of the shape of a frequency curve is whether it has one peak (unimodal) or more than one (bi-modal or multimodal). If it is unimodal (one peak), like most data sets, it is important to know whether it is symmetrical or asymmetrical in shape. Skewness measures the symmetry, or more accurately, the lack of symmetry of the frequency curve.

Fig. 2.58 Positive, negative and zero or no skewness
Skewness signifies the extent of asymmetry of the frequency curve and is of three types:

(a) Positive skewness
If most of the observations in a data set are located at the left of the curve (the peak is towards the lower class boundaries) and the right tail is longer, then the distribution is said to be skewed right or positively skewed. In a positively skewed frequency distribution or curve, the relation between the three measures of central tendency is mean > median > mode, i.e. the mean is greater than the median and the median is greater than the mode (Fig. 2.58).

(b) Negative skewness
If most of the observations in a data set are located at the right of the curve (the peak is towards the upper class boundaries) and the left tail is longer, then the distribution is said to be skewed left or negatively skewed. In a negatively skewed frequency distribution or curve, the relation between the three measures of central tendency is mean < median < mode, i.e. the mean is lower than the median and the median is lower than the mode (Fig. 2.58).

(c) Zero or no skewness or symmetric
A frequency distribution or curve is said to be symmetrical when the values are distributed identically on either side of the mean. In such conditions the curve looks identical to the left and right of the central point. In a symmetrical or zero skewed frequency distribution or curve, the relation between the three measures of central tendency is mean = median = mode, i.e. the values of mean, median and mode are equal (Fig. 2.58).

Skewness can be measured by different methods:

(1) Pearson's first measure:

$\text{Skewness} = \frac{\text{Mean} - \text{Mode}}{\text{Standard deviation}}$ (2.48)

(2) Pearson's second measure:

$\text{Skewness} = \frac{3(\text{Mean} - \text{Median})}{\text{Standard deviation}}$ (2.49)

(3) Bowley's measure:

$\text{Skewness} = \frac{Q_3 - 2Q_2 + Q_1}{Q_3 - Q_1}$ (2.50)

where $Q_1$, $Q_2$ and $Q_3$ are the lower, middle and upper quartiles, respectively.
(4) The moment measure of skewness is called the skewness coefficient, $\beta_1$ (read as beta-one):

$\text{Skewness coefficient } (\beta_1) = \frac{\mu_3}{\sigma^3}$ (2.51)

where $\mu_3$ is the third central moment and $\sigma$ is the standard deviation.

$\mu_3 = \frac{\sum_{i=1}^{n} (x_i - \bar{x})^3}{N}$ (for ungrouped data) (2.52)

$\mu_3 = \frac{\sum_{i=1}^{n} f_i (x_i - \bar{x})^3}{N}$ (for grouped data) (2.53)

In statistics, a 'moment' ($\mu$) about the mean is the mean of a power of the deviations of the individual items (or class marks) of the frequency distribution from the mean. For the first moment ($\mu_1$), the first powers of the deviations are added up and divided by the total size of the distribution, which is written symbolically as

$\mu_1 = \frac{\sum (x_i - \bar{x})}{N}$ (2.54)

Higher moments ($\mu_2$, $\mu_3$, $\mu_4$ etc.) are defined in the same way using the second, third and fourth powers of the deviations. In a symmetrical frequency distribution the odd moments $\mu_1$ and $\mu_3$ are both zero (0), so the skewness coefficient is also zero (0).

According to Pearson's measures, there are no theoretical limits of skewness, but generally the value lies between +3 and −3. According to Bowley's measure, the value of skewness ranges between +1 and −1.

Normal distribution (Normal Curve)

The normal distribution, also called the Gaussian distribution (after the mathematician Gauss), is a continuous probability distribution defined by the probability density function f(x), which is the height (Y) of the normal curve above the baseline at a given point ($x_i$) along the measurement scale of the random variable x (Pal 1998). The model used to obtain the desired probabilities is

$Y = f(x_i) = \frac{1}{\sigma \sqrt{2\pi}} \, e^{-\frac{1}{2} \left( \frac{x_i - \mu}{\sigma} \right)^2}$ (2.55)

where e is the base of the natural logarithm (2.71828), π is the mathematical constant (3.14159), μ and σ are the population mean and standard deviation, respectively, and $x_i$ is any value of the continuous random variable ($-\infty < x_i < +\infty$). A normal probability distribution has zero skewness (Sk = 0), although zero skewness alone does not make a distribution normal.
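Two of the formulas above lend themselves to a quick numerical check: the moment coefficient of skewness (Eqs. 2.51 and 2.52) and the ordinate of the normal curve (Eq. 2.55). The following Python sketch is only illustrative; the function names are hypothetical, and the values μ = 50 cm and σ = 10 cm are the rainfall example used in this chapter for constructing a normal curve.

```python
import math


def moment_skewness(values):
    """Skewness coefficient mu_3 / sigma^3 for ungrouped data (Eqs. 2.51, 2.52)."""
    n = len(values)
    mean = sum(values) / n
    mu2 = sum((v - mean) ** 2 for v in values) / n   # variance, sigma^2
    mu3 = sum((v - mean) ** 3 for v in values) / n   # third central moment
    return mu3 / mu2 ** 1.5                          # sigma^3 = mu2^(3/2)


def normal_ordinate(x, mu, sigma):
    """Height Y of the normal curve at x (Eq. 2.55)."""
    z = (x - mu) / sigma
    return math.exp(-0.5 * z * z) / (sigma * math.sqrt(2 * math.pi))


# A symmetrical set has zero skewness; a long right tail gives a positive value.
print(moment_skewness([1, 2, 3, 4, 5]))
print(moment_skewness([1, 1, 1, 10]) > 0)

# Ordinates of the normal curve for rainfall with mu = 50 cm, sigma = 10 cm.
for x in (25, 45, 50, 65, 75):
    print(x, round(normal_ordinate(x, 50, 10), 5))
```

The ordinates are symmetric about the mean (the values at 25 and 75 are equal) and largest at x = μ, which is the bell shape described in the text. Small differences in the last decimal place from hand-calculated tables can arise from rounding.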
The probability curve of the normal distribution is called the normal curve. The curve is symmetrical and bell-shaped (Figs. 2.58 and 2.59) and the two tails extend to infinity on either side. In the real world, many actual distributions, such as rainfall data for a raingauge station collected over a large number of years (assuming no climatic change), tend to develop a normal frequency distribution with a 'bell-shaped' symmetrical curve. This symmetry means that the height of the normal curve is the same if one moves equal distances to the left and right of the mean. The highest frequencies in this curve are around the mean and the frequency decreases as the distance from the mean increases.

If the values of μ and σ are known, it is possible to determine the Y function for constructing the normal curve and to calculate the area under the curve for any interval of $x_i$ (Table 2.31).

Table 2.31 Methods of calculating Y in f(x) for constructing a normal curve (rainfall in cm; μ = 50 cm and σ = 10 cm)
x_i    1/(σ√(2π))    (x_i − μ)/σ    ((x_i − μ)/σ)²    −½((x_i − μ)/σ)²    e^(−½((x_i − μ)/σ)²)    Y (see Eq. 2.55)
25     0.03989       −2.50000       6.25000           −3.12500            0.04393                 0.00175
45     0.03989       −0.50000       0.25000           −0.12500            0.88249                 0.03520
50     0.03989       0.00000        0.00000           0.00000             1.00000                 0.03989
65     0.03989       1.50000        2.25000           −1.12500            0.32465                 0.01295
75     0.03989       2.50000        6.25000           −3.12500            0.04393                 0.00175

Computing the value of Y for every value of $x_i$ is tedious. Hence the position of any particular observation (say a score $x_i$) may be expressed relative to the other scores in the data set by transforming it to a standardized normal random variable known as the 'standard score' or 'z-score' (read as 'zee'), where

$z = \frac{x_i - \mu}{\sigma} = \frac{\text{Critical value} - \text{Mean}}{\text{Standard deviation}}$ (2.56)

If all of the raw values in a distribution are converted to standard or z-scores, we get a new standardized distribution that always has a mean (μ) equal to zero (0) and a standard deviation (σ) equal to one (1). This useful transformation is known as standardization and gives a new value for each individual. As the normal curve is symmetrical about the mean, half of the area under the curve lies on each side of the mean. This means that in a frequency distribution approximating the normal curve, 50% of the values will be less than the mean and 50% will be greater than the mean (Fig. 2.59).

Fig. 2.59 Area under a standard normal curve

Properties of Normal Curve

The normal curve has a number of interesting properties. These include:

(i) The normal curve (normal distribution) has two important parameters, μ = mean and σ = standard deviation. In some literature, $\bar{x}$ is used in place of μ.
(ii) The normal curve is always 'bell-shaped' and unimodal in nature.
(iii) It is symmetrical about its centre (the line x = μ) and the mean occupies the centre. The vertical line drawn through the mean divides the curve into two equal halves. Thus, 50% of the values are less than the mean and 50% are greater than the mean.
(iv) The values obtained from the addition or subtraction of independent normally distributed values are also normally distributed.
(v) The three measures of central tendency are equal in value (i.e. mean = median = mode = μ), so they coincide on the curve.
(vi) The normal curve has two points of inflection (the points where the curve changes curvature) at a distance of ±σ on either side of the mean (μ). Thus, the normal curve is convex upward in the interval (μ − 1σ, μ + 1σ) and concave upward outside this interval.
(vii) The two tails of a normal curve are asymptotic to the x-axis or horizontal axis, i.e. if the two tails of a normal curve are extended in both directions to infinity, they never cut the x-axis.
(viii) The position of the normal curve depends on the mean, and its spread and the height of its peak depend on the standard deviation.
(ix) The percentage distribution of the area under a standard normal curve is (Table 2.32 and Fig. 2.59):
(a) 68.26% (68%) between μ ± 1σ
(b) 95.44% (95%) between μ ± 2σ
(c) 99.74% (99%) between μ ± 3σ

So, almost all the values of x will lie between the limits μ ± 3σ, i.e. mean ± 3(S.D.).

Kurtosis

For a frequency curve, it is also necessary to know the 'convexity of the curve', which is its 'kurtosis' (from a Greek word meaning 'bulkiness'). Two or more sets of data may have an equal average, spread and symmetry but differ in their degree of peakedness. Kurtosis measures the degree of peakedness or convexity of a frequency curve, i.e. the extent to which values are concentrated in one part of the curve. It explains whether the data set has an excessively large or small number of values (observations) in the intermediate ranges between the mean and the extreme values, resulting in a peaked or flat-topped frequency curve.

Kurtosis is based on the fourth moment about the mean, $\mu_4$. According to Pearson, it is expressed as a coefficient, the kurtosis coefficient, $\beta_2$ (read as beta-two).
Table 2.32 Standard normal distribution table
STANDARD NORMAL DISTRIBUTION (Values represent area to the left of the Z score)
(The left column stands for the first decimal value of z and the top row stands for the second decimal value of z)
z    0.00    0.01    0.02    0.03    0.04    0.05    0.06    0.07    0.08    0.09
0.0  0.5000  0.5040  0.5080  0.5120  0.5160  0.5199  0.5239  0.5279  0.5319  0.5359
0.1  0.5398  0.5438  0.5478  0.5517  0.5557  0.5596  0.5636  0.5675  0.5714  0.5753
0.2  0.5793  0.5832  0.5871  0.5910  0.5948  0.5987  0.6026  0.6064  0.6103  0.6141
0.3  0.6179  0.6217  0.6255  0.6293  0.6331  0.6368  0.6406  0.6443  0.6480  0.6517
0.4  0.6554  0.6591  0.6628  0.6664  0.6700  0.6736  0.6772  0.6808  0.6844  0.6879
0.5  0.6915  0.6950  0.6985  0.7019  0.7054  0.7088  0.7123  0.7157  0.7190  0.7224
0.6  0.7257  0.7291  0.7324  0.7357  0.7389  0.7422  0.7454  0.7486  0.7517  0.7549
0.7  0.7580  0.7611  0.7642  0.7673  0.7704  0.7734  0.7764  0.7794  0.7823  0.7852
0.8  0.7881  0.7910  0.7939  0.7967  0.7995  0.8023  0.8051  0.8078  0.8106  0.8133
0.9  0.8159  0.8186  0.8212  0.8238  0.8264  0.8289  0.8315  0.8340  0.8365  0.8389
1.0  0.8413  0.8438  0.8461  0.8485  0.8508  0.8531  0.8554  0.8577  0.8599  0.8621
1.1  0.8643  0.8665  0.8686  0.8708  0.8729  0.8749  0.8770  0.8790  0.8810  0.8830
1.2  0.8849  0.8869  0.8888  0.8907  0.8925  0.8944  0.8962  0.8980  0.8997  0.9015
1.3  0.9032  0.9049  0.9066  0.9082  0.9099  0.9115  0.9131  0.9147  0.9162  0.9177
1.4  0.9192  0.9207  0.9222  0.9236  0.9251  0.9265  0.9279  0.9292  0.9306  0.9319
1.5  0.9332  0.9345  0.9357  0.9370  0.9382  0.9394  0.9406  0.9418  0.9429  0.9441
1.6  0.9452  0.9463  0.9474  0.9484  0.9495  0.9505  0.9515  0.9525  0.9535  0.9545
1.7  0.9554  0.9564  0.9573  0.9582  0.9591  0.9599  0.9608  0.9616  0.9625  0.9633
1.8  0.9641  0.9649  0.9656  0.9664  0.9671  0.9678  0.9686  0.9693  0.9699  0.9706
1.9  0.9713  0.9719  0.9726  0.9732  0.9738  0.9744  0.9750  0.9756  0.9761  0.9767
2.0  0.9772  0.9778  0.9783  0.9788  0.9793  0.9798  0.9803  0.9808  0.9812  0.9817
2.1  0.9821  0.9826  0.9830  0.9834  0.9838  0.9842  0.9846  0.9850  0.9854  0.9857
2.2  0.9861  0.9864  0.9867  0.9871  0.9875  0.9878  0.9881  0.9884  0.9887  0.9890
2.3  0.9893  0.9896  0.9898  0.9901  0.9904  0.9906  0.9909  0.9911  0.9913  0.9916
2.4  0.9918  0.9920  0.9922  0.9925  0.9927  0.9929  0.9931  0.9932  0.9934  0.9936
2.5  0.9938  0.9940  0.9941  0.9943  0.9945  0.9946  0.9948  0.9949  0.9951  0.9952
2.6  0.9953  0.9955  0.9956  0.9957  0.9959  0.9960  0.9961  0.9962  0.9963  0.9964
2.7  0.9965  0.9966  0.9967  0.9968  0.9969  0.9970  0.9971  0.9972  0.9973  0.9974
2.8  0.9974  0.9975  0.9976  0.9977  0.9977  0.9978  0.9979  0.9979  0.9980  0.9981
2.9  0.9981  0.9982  0.9982  0.9983  0.9984  0.9984  0.9985  0.9985  0.9986  0.9986
3.0  0.9987  0.9987  0.9987  0.9988  0.9988  0.9989  0.9989  0.9989  0.9990  0.9990
3.1  0.9990  0.9991  0.9991  0.9991  0.9992  0.9992  0.9992  0.9992  0.9993  0.9993
3.2  0.9993  0.9993  0.9994  0.9994  0.9994  0.9994  0.9994  0.9995  0.9995  0.9995
3.3  0.9995  0.9995  0.9995  0.9996  0.9996  0.9996  0.9996  0.9996  0.9996  0.9997
3.4  0.9997  0.9997  0.9997  0.9997  0.9997  0.9997  0.9997  0.9997  0.9997  0.9998
3.5  0.9998  0.9998  0.9998  0.9998  0.9998  0.9998  0.9998  0.9998  0.9998  0.9998
3.6  0.9998  0.9998  0.9999  0.9999  0.9999  0.9999  0.9999  0.9999  0.9999  0.9999
3.7  0.9999  0.9999  0.9999  0.9999  0.9999  0.9999  0.9999  0.9999  0.9999  0.9999
3.8  0.9999  0.9999  0.9999  0.9999  0.9999  0.9999  0.9999  0.9999  0.9999  0.9999
3.9  1.0000  1.0000  1.0000  1.0000  1.0000  1.0000  1.0000  1.0000  1.0000  1.0000

The kurtosis coefficient ($\beta_2$) is obtained using the following formula:

$\beta_2 = \frac{\mu_4}{\mu_2^2}$ (2.57)

Simply it can be written as

$\beta_2 = \frac{\mu_4}{\sigma^4}$ (2.58)

where $\mu_4$ is the fourth central moment and $\sigma$ is the standard deviation.

$\mu_4 = \frac{\sum_{i=1}^{n} (x_i - \bar{x})^4}{N}$ (for ungrouped data) (2.59)

$\mu_4 = \frac{\sum_{i=1}^{n} f_i (x_i - \bar{x})^4}{N}$ (for grouped data) (2.60)

Kurtosis is usually judged against the normal curve, for which the kurtosis coefficient $\beta_2$ is 3.0. The kurtosis measure K is therefore taken as the departure of $\beta_2$ from this value:

$K = \beta_2 - 3$ (2.61)

so that for the normal curve, with $\beta_2 = 3.0$, K becomes zero.

Based on the peakedness of the frequency curve, three types of curves are identified:

(1) Leptokurtic (Positive kurtosis): When $\beta_2 > 3$ and K > 0, the curve is more peaked than the normal curve and is known as 'leptokurtic', i.e. a curve with a narrower central portion and higher tails than the normal curve (Fig. 2.60).
(2) Platykurtic (Negative kurtosis): When $\beta_2 < 3$ and K < 0, the curve is less peaked (flatter) and is known as 'platykurtic', i.e. a curve with a broader central portion and lower tails than the normal curve (Fig. 2.60).
(3) Mesokurtic (Normal curve): From the kurtosis point of view, the normal curve is the curve having $\beta_2 = 3$ and K = 0 and is known as 'mesokurtic' (Fig. 2.60).

Fig. 2.60 Degree of peakedness (Kurtosis) of frequency curve

Uses of Frequency Curve

(i) Different measures of central tendency and dispersion can be easily plotted on it.
(ii) Plotting frequency curves on the same base is an effective way to compare sets of data series.
(iii) The curve is also helpful in assessing the normality of a data set.

2.5.5.5 Cumulative Frequency Polygon and Curve (Ogive)

The graphical representation of cumulative frequencies in a frequency distribution is called the cumulative frequency polygon.
To draw this graph, the cumulative frequencies are plotted along the 'Y'-axis against the corresponding class boundaries plotted along the 'X'-axis, and the points obtained are joined by straight lines; the resulting line is known as the cumulative frequency polygon. Generally, two distinct polygons are drawn: (i) less than type and (ii) more than type.

(i) Less than type: The less than type polygon is drawn from the less than type cumulative frequencies (Tables 2.33 and 2.34). It begins from the lowest class boundary on the horizontal axis (abscissa or 'X'-axis), continues to rise and ends at the highest class boundary, corresponding to the total frequency (N) of the distribution. The less than polygon looks like a broad and elongated letter S.

(ii) More than type: In contrast to the less than type polygon, it is drawn from the more than type cumulative frequencies (Tables 2.33 and 2.34). It begins from the total frequency (N) at the lowest class boundary and progressively descends to the highest class boundary on the horizontal axis (abscissa or 'X'-axis). The more than type polygon looks like a broad, elongated but inverted letter S.

Table 2.33 Worksheet for drawing Ogive (with equal class size)
Class boundary    Frequency (f_i)    Less than: boundary    F     More than: boundary    F
170.5–220.5       9                  170.5                  0     170.5                  25
220.5–270.5       4                  220.5                  9     220.5                  16
270.5–320.5       2                  270.5                  13    270.5                  12
320.5–370.5       6                  320.5                  15    320.5                  10
370.5–420.5       3                  370.5                  21    370.5                  4
420.5–470.5       1                  420.5                  24    420.5                  1
                  N = ∑f_i = 25      470.5                  25    470.5                  0

Table 2.34 Worksheet for drawing Ogive (with unequal class size)
Class boundary    Frequency (f_i)    Less than: boundary    F      More than: boundary    F
0–50              40                 0                      0      0                      225
50–120            60                 50                     40     50                     185
120–250           45                 120                    100    120                    125
250–350           35                 250                    145    250                    80
350–600           25                 350                    180    350                    45
600–950           20                 600                    205    600                    20
                  N = ∑f_i = 225     950                    225    950                    0
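The worksheet columns of Table 2.33 can be generated mechanically from the class boundaries and frequencies. The Python sketch below is an illustration with hypothetical function names. It also reads the median off the less than polygon by linear interpolation at F = N/2, the point where the less than and more than polygons cross; the interpolation assumes, as the polygon itself does, that values are spread evenly within each class.

```python
def cumulative_tables(boundaries, frequencies):
    """Return (less_than, more_than) lists of (class boundary, cumulative F)."""
    total = sum(frequencies)
    less_than = [(boundaries[0], 0)]
    running = 0
    for boundary, f in zip(boundaries[1:], frequencies):
        running += f
        less_than.append((boundary, running))
    # At every boundary the two cumulative frequencies add up to N.
    more_than = [(b, total - cum) for b, cum in less_than]
    return less_than, more_than


def median_from_ogive(less_than):
    """Boundary value at F = N/2 by linear interpolation on the less than polygon."""
    half = less_than[-1][1] / 2
    for (b0, f0), (b1, f1) in zip(less_than, less_than[1:]):
        if f0 <= half <= f1 and f1 > f0:
            return b0 + (half - f0) * (b1 - b0) / (f1 - f0)
    return None


# SPM data (Table 2.33).
spm_bounds = [170.5, 220.5, 270.5, 320.5, 370.5, 420.5, 470.5]
spm_freq = [9, 4, 2, 6, 3, 1]

less, more = cumulative_tables(spm_bounds, spm_freq)
print(less)
print(more)
print(median_from_ogive(less))
```

The same two functions apply unchanged to the unequal class boundaries of Table 2.34, since an Ogive is plotted against class boundaries rather than class widths.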
The cumulative frequency curve is the modified form of the cumulative frequency polygon in which the plotted points are joined by smooth freehand curves instead of straight lines (Figs. 2.61 and 2.62). The combined representation of the less than type and more than type cumulative frequency curves looks like a wine glass, and the curves are therefore called wine-glass curves. A combined representation of less than and more than cumulative frequency polygons or curves is called an Ogive (Figs. 2.61 and 2.62). The two polygons or curves intersect at the median point of the distribution. The method of graphical construction of Ogives for a frequency distribution with unequal class widths is the same as for one with equal class widths.

Fig. 2.61 Cumulative frequency curve (Ogive) showing the average concentration of SPM (mg/m³) in air
Fig. 2.62 Cumulative frequency curve (Ogive) showing the monthly income of families

Uses of Cumulative Frequency Polygon and Curve (Ogive)

A cumulative frequency curve (Ogive) is more useful than a frequency curve for understanding the content of a frequency distribution. The uses of the Ogive include:

(1) As the Ogive is the only graphical representation of the cumulative frequency distribution, it is very useful for finding the values of the median, quartiles, deciles and percentiles graphically.
(2) The number of observations (frequencies) lying below or above a particular value, or between any two specified values, can be easily found from the Ogive.
(3) It is also useful for finding the cumulative frequencies above or below a certain specified value of the variable.

References

Alhamed M, Ahmad Ali S (2017) Hypsometric curve and hypsometric integral analysis of the Abdan Basin, Almahfid Basement Rock, Yemen. National seminar on recent advances and challenges in geochemistry, Env Sed Geol
Chow VT (1959) Open channel hydraulics. McGraw-Hill, New York
Das NG (2009) Statistical methods (Volume I & II). McGraw Hill Education (India) Pvt Ltd. ISBN: 978-0-07-008327-1
Geddes A, Ogilvie AG (1938) The technique of regional geography. Jour of MGS 13(2):121–132
Mahmood A (1999) Statistical methods in geographical studies. Rajesh Publication. ISBN: 9788185891170, 81-85891-17-6
Mitra A (1964) A functional classification of Indian towns. Institute of Economic Growth, India
Pal SK (1998) Statistics for geoscientists: techniques and applications. Concept Publishing Company, New Delhi. ISBN: 81-7022-712-1
Saksena RS (1981) A handbook of statistics. Indological Publishers & Booksellers
Sarkar A (2015) Practical geography: a systematic approach. Orient Blackswan Private Limited, Hyderabad, Telengana, India. ISBN: 978-81-250-5903-5
Siddhartha K, Mukherjee S (2002) Cities, urbanisation and urban systems. Kisalaya Publications. ISBN: 81-87461-00-4
Singh VP (1994) Elementary hydrology. Prentice Hall of India Private Limited, New Delhi
Singh RL, Singh RPB (1991) Elements of practical geography. Kalyani Publishers
Sokolov AA, Chapman TG (eds) (1974) Methods for water balance computations. An international guide for research and practice, studies and reports in hydrology 17. UNESCO Press, Paris
Strahler A (1952) Dynamic basis of geomorphology. Geol Soc Am Bull 63:923–938. https://doi.org/10.1130/0016-7606(1952)63[923:DBOG]2.0.CO;2
Sutcliffe JV, Agrawal RP, Tucker JM (1981) The water balance of the Betwa basin, India/Le bilan hydrologique du bassin versant de Betwa en Inde. Hydrol Sci Bull 26(2):149–158. https://doi.org/10.1080/02626668109490872
Taylor TG (1949) The control of settlement by humidity and temperature (with special reference to Australia and the Empire): an introduction to comparative climatology. Melbourne, VIC, Commonwealth Bureau of Meteorology
Zipf GK (1949) Human behaviour and the principle of least effort, an introduction to human ecology. Addison-Wesley, Cambridge, MA

Chapter 3 Diagrammatic Representation of Geographical Data

Abstract Diagrammatic representation and visualization of geographical data is simple, attractive and easy to understand and explain, for geographers as well as for the general literate public. It helps to explore the nature of data and the pattern of their spatial and temporal variations, and to understand their relationships, in order to recognize and analyse accurately features on or near the earth's surface. This chapter focuses on a detailed discussion of various types of diagrams classified on different bases. All types of one-dimensional (bar, pyramid etc.), two-dimensional (circular, triangular, square etc.), three-dimensional (cube, sphere etc.) and other diagrams (pictograms and kite diagram) are discussed with suitable examples in terms of their appropriate data structure, necessary numerical (geometrical) calculations, methods of construction, appropriate illustrations, and advantages and disadvantages of their use. It includes all the fundamental geometric principles and the derivation of formulae used for the construction of these diagrams. A step-by-step and logical explanation of their construction methods helps readers to understand the essence of the diagrams easily and quickly. Each diagram represents a clear correlation between the theoretical knowledge of various geographical events and phenomena and their proper practical application with suitable examples.

Keywords Diagrammatic representation · Geometric principles · One-dimensional diagram · Two-dimensional diagram · Three-dimensional diagram

3.1 Concept of Diagram

The diagram is another important form of visual representation of geographical data in which importance is laid on the basic facts of one selected element.
In the diagram, data are represented in a very much abstract and conventionalized geometric form. All types of categorical and geographical data, including time series and spatial series data, can be easily represented in diagrams. Representation of different geographical data by suitable diagrams is easy to understand and appreciated by all the people without having geographical, geometrical and statistical knowledge. © The Author(s), under exclusive license to Springer Nature Singapore Pte Ltd. 2021 S. K. Maity, Essential Graphical Techniques in Geography, Advances in Geographical and Environmental Sciences, https://doi.org/10.1007/978-981-16-6585-1_3 153 http://crossmark.crossref.org/dialog/?doi=10.1007/978-981-16-6585-1_3&domain=pdf https://doi.org/10.1007/978-981-16-6585-1_3 154 3 Diagrammatic Representation of Geographical Data 3.2 Advantages and Disadvantages of Data Representation in Diagrams Advantages The advantages of representation of data in diagram are: (i) Diagram makes the data simple, attractive and impressive, so easily intelli- gible to all. (ii) It saves a considerable amount of time, labour and energy. (iii) Comparison of two or more sets of data becomes possible and easy. (iv) It has universal utility, i.e. the technique is used all over the world. (v) It becomes helpful to detect the errors in data if any. (vi) Various complex data can be easily and simply represented by diagrams. (vii) It has an immense memorizing effect. Disadvantages In spite of all these advantages, representation of data by diagram has some limitations: (i) Diagram does not provide a detailed description of data (ii) Data can’t be represented completely and accurately in a diagram. (iii) Portrayal of small variation in two ormore sets of data is difficult in a diagram. (iv) Most of the diagrams are useful to the common people but these are of little significance to the professionals. 
Again, some diagrams (three-dimensional or multi-dimensional diagrams) are specifically useful to professionals and experts.
(v) Depiction of three or more sets of data becomes difficult or even impossible in a diagram.
(vi) A diagram does not represent the overall character of the data; rather, it signifies the general conditions only.
(vii) The application of diagrams is very limited in applied research.

3.3 Difference Between Graph and Diagram

Graphs and diagrams are two important techniques for the representation of statistical as well as geographical data, but the major differences between them are as follows:

| No. | Graph | Diagram |
|---|---|---|
| 1 | Representation of a series of data on graph paper by means of Cartesian, polar or oblique co-ordinates on a reference frame is called a graph | Representation of data in a highly abstract and conventionalized geometric form on two-dimensional plain paper is called a diagram |
| 2 | Principles of co-ordinate geometry are applied extensively in drawing graphs | Some geometrical principles are applied, but co-ordinate geometry is applied little, if at all, in diagrams |
| 3 | Easy to draw, because it requires only a knowledge of co-ordinate geometry | Drawing of diagrams requires skill, experience and artistic knowledge. Lack of these makes the diagram less attractive and impressive |
| 4 | A graph depicts the functional or mathematical relationship between two or more variables | A diagram does not depict any functional or mathematical relationship between variables; it is used for comparisons only |
| 5 | Generally, time series data and frequency distribution data are appropriately represented in a graph | Diagrams are constructed for representing categorical data, including time series and spatial series data |
| 6 | Graphs are very suitable for, and widely used in, the statistical analysis of geographical data | Diagrams are less suitable for the statistical analysis of geographical data |
| 7 | The values of the median and mode can be easily estimated from a graph | Median and mode can't be estimated from a diagram |
| 8 | Graphs are less attractive and impressive to the eye | Diagrams are attractive to the eye and are better suited for publicity and propaganda |
| 9 | In a graph, data are represented by points or lines | In a diagram, data are represented by bars, pies, rectangles, squares, spheres etc. |

3.4 Types of Diagrams in Data Representation

Based on the type and nature of data, the following categories of diagrams can be distinguished: statistical diagrams, geographical diagrams and statistical-geographical diagrams (Sarkar 2015). In addition, on the basis of the geometry of the figures to be constructed, diagrams may be classified into different types, from which geographers or researchers have to select the most suitable one (Table 3.1).

Table 3.1 Types of diagrams

3.4.1 One-Dimensional Diagrams

A one-dimensional diagram is one in which the size of only one dimension, i.e. length, is fixed in proportion to the value of the data it represents.

3.4.1.1 Bar Diagram

The representation of statistical or geographical data in the form of bars is called a bar diagram. It consists of a number of bars that are equal in width and equally spaced. The bars are drawn from a common baseline, and the length or height of each bar is directly proportional to the value it signifies. Based on the constructional arrangement of the bars, three categories are identified:
(i) Vertical or columnar bar diagram: Bars are drawn vertically above the abscissa (x-axis).
(ii) Horizontal bar diagram: Bars are drawn parallel to the abscissa along the ordinate (y-axis).
(iii) Pyramidal bar diagram: Horizontal bars are arranged in such a way that they form a pyramid.
Principles of Construction of Bar Diagrams

Although there are no hard and fast rules for constructing a bar diagram, some important principles are followed in drawing it.
1. The bars should be neither too short nor too long. In other words, the bars should be proportionate in length and breadth.
2. The baseline from which the bars are drawn should be clearly shown.
3. The scale should be mentioned clearly and accurately.
4. The intervening space between bars should be equal, i.e. bars should be drawn at an equal distance from each other.
5. The bars should be coloured or shaded in order to make them impressive and attractive.
6. Generally, vertical bars are used to represent time series data, whereas horizontal bars are used to depict data classified geographically or by their attributes.

Fig. 3.1 Vertical simple bar (Temporal change of urban population in India since independence). Source: Census of India, 2011

Advantages and Disadvantages of the Use of Bar Diagrams

The major advantages and disadvantages of the use of bar diagrams are as follows:

Advantages
1. Drawing and understanding a bar diagram is very simple and easy.
2. A large amount of data can be easily represented in a bar diagram.
3. A bar diagram can be drawn either vertically or horizontally.
4. It facilitates comparison of different data series.

Disadvantages
1. It is very difficult to represent a large number of aspects of any data in a bar diagram.
2. The drawer fixes the width of the bars arbitrarily.

Types of Bar Diagrams

Simple Bar Diagram

A bar diagram showing only one component or category of data is called a simple bar diagram. In this type each bar represents a single value only (Tables 3.2 and 3.3). It can be drawn either on a horizontal base (Fig. 3.1) or a vertical base (Fig. 3.2), but bars on a horizontal base are frequently used.
The width of the bars must be equal and they should be spaced at equal distances from one another. The scale for constructing a simple bar diagram should be selected based on the highest and lowest values of the data to be represented.

Fig. 3.2 Horizontal simple bar (Total population in selected states in India). Source: Census of India, 2011

Table 3.2 Data for vertical simple bar diagram (Temporal changes of urban population in India). Scale selected: 1 cm to 100,000,000 urban population

| Year | Urban population | Height of each bar (cm) |
|---|---|---|
| 1951 | 62,443,709 | 0.62 |
| 1961 | 78,936,603 | 0.79 |
| 1971 | 109,113,977 | 1.09 |
| 1981 | 159,462,547 | 1.59 |
| 1991 | 217,177,625 | 2.17 |
| 2001 | 285,354,954 | 2.85 |
| 2011 | 361,986,870 | 3.62 |

Source: Census of India, 2011

Example Total population in different states of India, total population in different years or decades in India, year-wise production of wheat in India, state-wise production of rice in India, coal production in different countries of the world etc. can be represented by simple bar diagrams.

Table 3.3 Data for horizontal simple bar diagram (Total population in selected states in India, 2011). Scale selected: 1 cm to 50,000,000 population

| Name of the state | Total population (2011) | Length of each bar (cm) |
|---|---|---|
| Uttar Pradesh | 199,581,477 | 3.99 |
| Maharashtra | 112,372,972 | 2.24 |
| Bihar | 103,804,673 | 2.08 |
| West Bengal | 91,347,736 | 1.82 |
| Andhra Pradesh | 84,665,533 | 1.70 |
| Madhya Pradesh | 72,383,628 | 1.44 |
| Tamil Nadu | 72,138,958 | 1.44 |

Source: Census of India, 2011

An important limitation of simple bar diagrams is that they can represent only one component or one category of data. For example, while depicting the population of different states in India, we can show only the total population; the sex-wise distribution of population in different states can't be represented in a simple bar diagram.
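The bar heights listed in Table 3.2 follow from dividing each value by the number of data units represented by 1 cm. The short Python sketch below illustrates that arithmetic only; the function name `bar_lengths` and its parameter `units_per_cm` are illustrative assumptions, not terms used in the chapter.

```python
def bar_lengths(values, units_per_cm):
    """Return the bar length (cm) for each labelled value.

    `units_per_cm` is the selected scale: how many data units 1 cm stands for.
    Both names are hypothetical and chosen only for this illustration.
    """
    if units_per_cm <= 0:
        raise ValueError("the scale must be a positive number of units per cm")
    return {label: value / units_per_cm for label, value in values.items()}


# Urban population of India, taken from Table 3.2.
urban_population = {
    1951: 62_443_709,
    1961: 78_936_603,
    1971: 109_113_977,
    1981: 159_462_547,
    1991: 217_177_625,
    2001: 285_354_954,
    2011: 361_986_870,
}

# Scale selected in Table 3.2: 1 cm to 100,000,000 urban population.
lengths = bar_lengths(urban_population, 100_000_000)

for year, length_cm in lengths.items():
    print(f"{year}: {length_cm:.2f} cm")
```

Changing `units_per_cm` rescales every bar by the same factor, which is why the scale is chosen with the largest and smallest values in mind: the tallest bar must fit the page while the shortest remains visible.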
Multiple Bar Diagram

A bar diagram in which different bars of proportionate lengths are drawn side by side to represent the components is called a multiple bar diagram (Fig. 3.3). It is generally used to compare two or more sets of statistical or geographical data (Table 3.4). In order to distinguish the bars, they are either coloured differently or filled with different hatching or dot patterns. A multiple bar diagram is always accompanied by an index or legend indicating the meaning of the different patterns or colours (Fig. 3.3).

Sub-Divided or Compound Bar Diagram

A compound bar diagram is one in which a single bar is sub-divided into different parts in proportion to the values given in the data (Fig. 3.4). It is used when the data are composed of more than one component within a total (Table 3.5). The single bar represents the total or aggregate value, while the component parts represent the component values of the aggregate. Sub-divisions in a bar are distinguished by using different colours, dot patterns or hatching. A legend or index is also given to indicate the meaning of the different colours or patterns. This diagram reflects the relation among the different components and also between each component and the aggregate. A compound bar diagram is also known as a composite or component bar diagram.

Fig. 3.3 Multiple bars showing the continent-wise urban population (%) in 2000 and 2025*. Source: UN Population Division, 2009–2010 and The World Guide, 12th ed.
* Projected figures

Table 3.4 Calculations for multiple bar diagram (Continent-wise urban population). Scale selected: 1 cm to 20% urban population

| Name of the continent | Urban population (%), 2000 | Urban population (%), 2025* | Height or length of bar (cm), 2000 | Height or length of bar (cm), 2025* |
|---|---|---|---|---|
| Africa | 38.5 | 49.6 | 1.92 | 2.48 |
| Europe | 75.0 | 90.0 | 3.75 | 4.5 |
| Anglo America | 80.0 | 86.0 | 4.0 | 4.3 |
| Latin America | 68.5 | 75.0 | 3.42 | 3.75 |
| Asia | 40.0 | 50.0 | 2.0 | 2.5 |
| Oceania | 72.5 | 75.3 | 3.62 | 3.76 |

* Projected figures
Source: UN Population Division, 2009–2010 and The World Guide, 12th ed.

Fig. 3.4 Sub-divided bar (Production of different crops in selected years in India). Source: Ministry of Agriculture and Economic Survey, 2010–2011 and Husain, 2014

Table 3.5 Calculations for sub-divided bar diagram (Production of different crops in India, 1950–1951 to 2010–2011). Scale selected: 1 cm to 40 million tonnes

| Name of crops | Production (million tonnes), 1950–51 | Production (million tonnes), 1970–71 | Production (million tonnes), 2010–2011 | Height of the bar (cm), 1950–51 | Height of the bar (cm), 1970–71 | Height of the bar (cm), 2010–2011 |
|---|---|---|---|---|---|---|
| Rice | 30.8 | 37.6 | 95.32 | 0.77 | 0.94 | 2.38 |
| Wheat | 9.7 | 18.2 | 85.93 | 0.24 | 0.45 | 2.15 |
| Pulses | 12.5 | 13.4 | 28.0 | 0.31 | 0.33 | 0.7 |
| Coarse-grains | 15.5 | 31.4 | 30.0 | 0.39 | 0.78 | 0.75 |

Source: Ministry of Agriculture and Economic Survey, 2010–2011 and Husain, 2014

Percentage Bar Diagram

A percentage bar diagram is a special form of sub-divided bar diagram in which the value of each component is converted into a percentage (%) of the whole (Fig. 3.5 and Table 3.6). The basic difference between the two is that in a sub-divided bar diagram the bars are of different heights because their totals may differ, whereas in a percentage bar diagram the bars are equal in height because each bar represents 100% (Fig. 3.5). For data having more than one component, the percentage bar diagram is more appropriate and convincing than the sub-divided bar diagram, as it makes comparison of the different components easier.
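As a rough illustration of the conversion behind a percentage bar, the Python sketch below turns the 1950–51 crop figures of Table 3.5 into percentages of their total and stacks them into one bar. The scale of 1 cm to 20% and the helper name `percentage_segments` are assumptions made for this example; the chapter does not draw this particular bar.

```python
def percentage_segments(components, percent_per_cm=20.0):
    """Convert component values into stacked segments of a percentage bar.

    Returns a list of (name, percent, start_cm, end_cm) tuples. The default
    scale of 1 cm to 20% is an assumption made for this sketch.
    """
    total = sum(components.values())
    if total <= 0:
        raise ValueError("the component values must sum to a positive total")
    segments = []
    base_cm = 0.0
    for name, value in components.items():
        percent = 100.0 * value / total
        height_cm = percent / percent_per_cm
        segments.append((name, percent, base_cm, base_cm + height_cm))
        base_cm += height_cm  # the next segment starts where this one ends
    return segments


# Crop production in 1950-51 (million tonnes), taken from Table 3.5.
crops_1950_51 = {"Rice": 30.8, "Wheat": 9.7, "Pulses": 12.5, "Coarse-grains": 15.5}

segments = percentage_segments(crops_1950_51)
for name, percent, start_cm, end_cm in segments:
    print(f"{name}: {percent:.2f}% drawn from {start_cm:.2f} cm to {end_cm:.2f} cm")
```

Whatever the absolute total, the last segment ends at 100% of the bar (5 cm at this scale), which is the property that makes bars for different years or regions directly comparable.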
Fig. 3.5 Percentage bar showing the proportion of population in different age groups in selected states in India. Source: Census of India, 2011

Table 3.6 Calculations for percentage bar diagram (Proportion of population in different age groups in selected states in India, 2011). Scale selected: 1 cm to 20% population

| Name of the state | Population (%), 0–14 years | Population (%), 15–59 years | Population (%), ≥60 years | Total (%) | Height of the bar (cm), 0–14 | Height of the bar (cm), 15–59 | Height of the bar (cm), ≥60 | Total (cm) |
|---|---|---|---|---|---|---|---|---|
| Uttar Pradesh | 33.7 | 59.5 | 6.8 | 100 | 1.68 | 2.97 | 0.35 | 5 |
| Maharashtra | 27.2 | 63.6 | 9.3 | 100 | 1.36 | 3.18 | 0.46 | 5 |
| Bihar | 37.3 | 55.8 | 7.0 | 100 | 1.86 | 2.79 | 0.35 | 5 |
| West Bengal | 25.5 | 66.3 | 8.2 | 100 | 1.27 | 3.31 | 0.42 | 5 |
| Andhra Pradesh | 24.6 | 66.6 | 8.8 | 100 | 1.23 | 3.33 | 0.44 | 5 |
| Madhya Pradesh | 32.1 | 60.8 | 7.1 | 100 | 1.60 | 3.05 | 0.35 | 5 |
| Tamil Nadu | 23.4 | 66.1 | 10.5 | 100 | 1.18 | 3.30 | 0.52 | 5 |

Source: Census of India, 2011

3.4.1.2 Pyramids

A pyramid is a specific type of bar diagram in which the bars are placed in such a way that they form a pyramid-like structure. Pyramid diagrams are popularly used, in different forms, in several branches of geography, including population studies, urban studies and ecological or ecosystem studies, for the proper and accurate representation of geographical data.

Pyramids in Population Studies (Age–sex Pyramid)

In population geography, a pyramid is generally used to represent the age–sex composition of the population of a country or region (age–sex pyramid). Age groups are shown vertically, with the base representing the youngest group and the apex the oldest, whereas the male and female populations are shown horizontally, males to the left and females to the right of the pyramid. Generally, the age groups are uniform and horizontal bars of uniform width are drawn. If the age groups are unequal, the widths of the bars become unequal as well.
Population pyramids may be drawn in two ways: (i) based on the absolute numbers of the male and female population (Fig. 3.6a) and (ii) based on the proportion or percentage of the male and female population with respect to the total (Fig. 3.6b). There are two possible methods for converting absolute numbers of males and females into percentages. First, the female population in each age group may be expressed as a percentage of the total female population of the country or region, and the percentage of the male population in each age group may be calculated in the same way (Table 3.7). Secondly, the male and female population in each age group may be expressed as a percentage of the total population of the country or region. The pyramid of absolute numbers shows the size and composition of the population of a country or region, whereas the percentage pyramid is used to compare the age–sex composition of two or more countries or regions on a single scale, even when one population is small and the other large. The work participation status of the male and female population in different age groups can also be easily represented by a pyramid diagram.

Fig. 3.6 a Absolute population pyramid and b percentage population pyramid

Pyramids in Ecological Studies

The pyramid diagram is very popular in ecological or ecosystem studies. The structure and function of successive trophic levels, i.e. producers, primary consumers, secondary consumers and tertiary consumers, may be represented graphically by means of ecological pyramids (Sharma 1975). In such a pyramid, the producers (commonly the green plants) constitute the base (Trophic level-1), and successive levels or tiers are occupied by the organisms of the different consumer levels, which form the apex (Fig. 3.7).
Based on the number of organisms, biomass and energy at different trophic levels, ecological pyramids are of three types:
(i) Pyramid of numbers: It portrays the number of organisms at different trophic levels, which commonly decreases from base to apex.
(ii) Pyramid of biomass: It represents the total dry weight of living organisms (living matter) at each trophic level.
(iii) Pyramid of energy: It represents the rate of energy flow and/or productivity at different trophic levels.

All these ecological pyramids are commonly upright in shape. However, pyramids of numbers and biomass are sometimes inverted, depending upon the nature and character of the food chain of a particular ecosystem.

Fig. 3.7 Ecological pyramid (Pyramid of numbers)

Pyramids in Urban Studies

In urban geography, the absolute number or percentage distribution of urban areas or cities in different size classes may be represented in the form of a pyramid (Table 3.8). Here, the number or percentage of cities in each class is plotted horizontally against the size classes of the cities (plotted vertically). Symmetrical horizontal bars are thus placed on both sides of the size class column, forming a pyramid-like structure (Fig. 3.8). The length of each bar is directly proportional to the number or percentage of cities it represents.

Similarly, the number or percentage of the population living in different size classes of towns or cities may also be portrayed in the form of an urban pyramid. Urban geographers around the world widely use these types of urban pyramids to understand and explain the structural characteristics of the urban system of any country or region.

Fig. 3.8 Urban pyramid showing the percentage of towns in different size classes in India

Table 3.7 Worksheet for age-sex pyramid (Based on the population of Purba Medinipur district, West Bengal, 2011). Scale selected: 1 cm to 150,000 male and female (absolute numbers); 1 cm to 3% male and female (percentages)

| Age group | Absolute number ('000), Male | Absolute number ('000), Female | Length of the bar (cm), Male | Length of the bar (cm), Female | Percentage (%), Male | Percentage (%), Female | Length of the bar (cm), Male | Length of the bar (cm), Female |
|---|---|---|---|---|---|---|---|---|
| 0–4 | 469 | 449 | 3.13 | 3.0 | 9.54 | 9.38 | 3.18 | 3.13 |
| 5–9 | 597 | 569 | 3.98 | 3.79 | 12.14 | 11.89 | 4.05 | 3.96 |
| 10–14 | 598 | 570 | 3.99 | 3.8 | 12.16 | 11.90 | 4.05 | 3.96 |
| 15–19 | 487 | 450 | 3.25 | 3.0 | 9.90 | 9.40 | 3.3 | 3.13 |
| 20–24 | 426 | 435 | 2.84 | 2.9 | 8.66 | 9.09 | 2.89 | 3.03 |
| 25–29 | 412 | 438 | 2.75 | 2.92 | 8.38 | 9.15 | 2.79 | 3.05 |
| 30–34 | 380 | 357 | 2.53 | 2.38 | 7.73 | 7.46 | 2.58 | 2.49 |
| 35–39 | 362 | 337 | 2.41 | 2.25 | 7.36 | 7.04 | 2.45 | 2.35 |
| 40–44 | 284 | 323 | 1.89 | 2.15 | 5.77 | 6.75 | 1.92 | 2.25 |
| 45–49 | 249 | 214 | 1.66 | 1.43 | 5.06 | 4.47 | 1.69 | 1.49 |
| 50–54 | 167 | 150 | 1.11 | 1.0 | 3.40 | 3.13 | 1.13 | 1.04 |
| 55–59 | 143 | 135 | 0.95 | 0.9 | 2.91 | 2.82 | 0.97 | 0.94 |
| 60+ | 343 | 360 | 2.29 | 2.4 | 6.97 | 7.52 | 2.32 | 2.51 |
| Total | 4917 | 4787 | | | 100 | 100 | | |

Source: Census of India

3.4.1.3 Difference Between Histogram and Bar Diagram

Though histograms and bar diagrams are nearly similar in appearance, there are some specific and important differences between them.
Table 3.8 Database for urban pyramid (Size class distribution of towns in India, 2011). Scale selected: 1 cm to 5% towns

| Size class | Number of towns, 1951 | Number of towns, 2011 | Percentage (%) of towns, 1951 | Percentage (%) of towns, 2011 | Length of the bar (cm), 1951 | Length of the bar (cm), 2011 |
|---|---|---|---|---|---|---|
| Class-I (>100,000) | 69 | 505 | 3.11 | 6.36 | 0.62 | 1.27 |
| Class-II (50,000–100,000) | 107 | 605 | 4.82 | 7.63 | 0.96 | 1.53 |
| Class-III (20,000–49,999) | 363 | 1,905 | 16.36 | 24.01 | 3.27 | 4.80 |
| Class-IV (10,000–19,999) | 571 | 2,233 | 25.73 | 28.15 | 5.15 | 5.63 |
| Class-V (5000–9999) | 737 | 2,187 | 33.21 | 27.57 | 6.64 | 5.51 |
| Class-VI (<5000) | 372 | 498 | 16.76 | 6.28 | 3.35 | 1.26 |
| Total | 2,219 | 7,933 | 100 | 100 | | |

Source: Census of India

| No. | Histogram | Bar diagram |
|---|---|---|
| 1 | A histogram is the graphical representation of statistical data by rectangles or bars drawn on a horizontal baseline to show the frequency of numerical data | A bar diagram is the diagrammatic representation of statistical data in the form of bars to compare different categories of data |
| 2 | It indicates the distribution of continuous variables | It indicates the comparison of discontinuous or discrete variables |
| 3 | It represents quantitative data | It represents categorical data |
| 4 | Class boundaries are shown along the x-axis and the number of observations (frequency) along the y-axis | Time, place or other categories are shown along the x-axis, while the amount or quantity of information is shown along the y-axis |
| 5 | Bars or rectangles are adjoining or continuous, i.e. there is no space between bars | Bars are discontinuous, i.e. there is equal space between bars |
| 6 | The height of each bar is directly proportional to the frequency of the corresponding class | Lengths or heights of bars are directly proportional to the amount or quantity of information they represent |
| 7 | Data are grouped into classes so that they are treated as continuous ranges | Data are taken as individual entities |
| 8 | The width of the bars is the same for equal class sizes but differs in a frequency distribution with unequal class sizes | The width is the same for all the bars |
| 9 | The bars cannot be reordered | Bars can be easily reordered |
| 10 | More than one frequency distribution can't be represented at a time | More than one component or variable can be easily represented at a time in a compound or complex bar diagram |

3.4.2 Two-Dimensional Diagrams

Unlike one-dimensional diagrams, in which only the length is considered, two-dimensional diagrams take both the length and the breadth into consideration. The concept of area is therefore central to two-dimensional diagrams, which are also called area diagrams or surface diagrams. Important two-dimensional diagrams are described in the following sub-sections.

3.4.2.1 Rectangular Diagram

The rectangular diagram, an important two-dimensional method, may be used when two or more quantities are to be compared and each quantity is again sub-divided into various constituent parts (Saksena 1981) (Table 3.9). Rectangular diagrams are analogous to compound bar diagrams, but whereas the length of a bar is directly proportional to the quantity it indicates, here the areas of the rectangles and of their constituent parts are kept in proportion to the values. Generally, the rectangles are placed side by side to make them comparable. When two or more sets of data are represented, keeping the scale the same makes the computation for construction easier.

In a rectangular diagram, data may be represented in two ways: (i) by representing the actual figures as they are given (Fig. 3.9) and (ii) by converting the actual figures into percentages (Fig. 3.10). The percentage sub-divided rectangular diagram is more popular and acceptable than the absolute sub-divided rectangular diagram, as it makes the data easily comparable on a percentage basis. Different colours, dot patterns or hatching may be used to distinguish the constituent parts of the rectangles. Since the total irrigated land areas in 1950–1951 and 2000–2001 are 18,855 and 54,490 thousand hectares, respectively, the widths of the rectangles will be in the ratio of 18,855:54,490, i.e. approximately 1:2.89.

Fig. 3.9 Rectangular diagram showing the area of irrigated land (hectares) by different sources in India

Fig. 3.10 Rectangular diagram showing the area of irrigated land (%) by different sources in India

Table 3.9 Calculations for rectangular diagram (Area of irrigated land by different sources of irrigation in India)

| Sources of irrigation | 1950–1951: Irrigated area (thousand hectares) | 1950–1951: % | 1950–1951: Cumulative % | 2000–2001: Irrigated area (thousand hectares) | 2000–2001: % | 2000–2001: Cumulative % |
|---|---|---|---|---|---|---|
| Canals | 8,295 | 44.0 | 44.0 | 15,790 | 28.98 | 28.98 |
| Wells and tube wells | 5,980 | 31.7 | 75.7 | 33,275 | 61.07 | 90.05 |
| Tanks | 3,610 | 19.1 | 94.8 | 2,525 | 4.63 | 94.68 |
| Others | 970 | 5.2 | 100 | 2,900 | 5.32 | 100 |
| Total | 18,855 | 100 | | 54,490 | 100 | |

Source: Statistical Abstracts of India, 2005–2006 and Husain, 2014

3.4.2.2 Triangular Diagram

The triangular diagram is an important two-dimensional diagram consisting of a series of equilateral triangles in which the area of each triangle ('a') is directly proportional to the quantity ('q') it indicates (Sarkar 2015).
Theoretically,

$a \propto q$ (3.1)

or $a = k \cdot q$ (k = proportionality constant)

If the side of the equilateral triangle having area 'a' is 'l', then

$\frac{\sqrt{3}}{4} l^2 = a$ (3.2)

[Area of an equilateral triangle with side length 'l' = $\frac{\sqrt{3}}{4} l^2$]

$\frac{\sqrt{3}}{4} l^2 = k \cdot q$ (since $a = k \cdot q$)

$l^2 = \frac{4 k \cdot q}{\sqrt{3}}$

$l = \sqrt{k \cdot \frac{4q}{\sqrt{3}}}$ (3.3)

So, for any item (i), corresponding to the quantity (q), the side of the equilateral triangle (l) can be represented by the following equation (taking k = 1, since the constant is absorbed into the drawing scale):

$l_i = \sqrt{\frac{4 q_i}{\sqrt{3}}}$ (3.4)

For drawing the triangles, a suitable scale should be selected carefully (Table 3.10) so that no individual triangle becomes too small or too large with respect to the given base map. Each triangle should be drawn within the boundary of the respective administrative unit of the base map; if no map is available, the triangles should be drawn on the same baseline with a uniform distance between them. The diagrammatic representation of the proportional scale must contain at least three equilateral triangles showing approximately the largest, medium and smallest quantities of the given data (Fig. 3.11).

Table 3.10 Worksheet for triangular diagram (Geographical area of selected biosphere reserves in India). Scale selected: 1 cm to 80 units

| Biosphere reserve | Geographical area (sq. km) | $l_i = \sqrt{4 q_i / \sqrt{3}}$ | Length of the side of the triangle (cm) |
|---|---|---|---|
| Sundarban | 9,630 | 149.13 | 1.86 |
| Manas | 2,837 | 80.94 | 1.01 |
| Nilgiri | 5,520 | 112.91 | 1.41 |
| Gulf of Mannar | 10,500 | 155.72 | 1.95 |
| Simlipal | 4,374 | 100.50 | 1.26 |
| Panchamarhi | 4,928 | 106.68 | 1.33 |
| For proportional scale: Largest | 11,000 | 159.38 | 1.99 |
| For proportional scale: Medium | 6,750 | 124.85 | 1.56 |
| For proportional scale: Smallest | 2,500 | 75.98 | 0.95 |

Source: Geography of India by Majid Husain, 2014

Fig. 3.11 Triangular diagram (Geographical area of selected biosphere reserves in India)

3.4.2.3 Square Diagram

The square diagram, another important two-dimensional diagram, consists of a series of squares in which the size of each square is directly proportional to the quantity it signifies. Unlike the rectangular diagram, in which the representation of widely varying data is difficult, the square diagram can represent any quantity of data easily and simply.

The drawing of the square diagram is based on the principle that the area of each square ('a') is directly proportional to the quantity it represents ('q'). Therefore,

$a \propto q$ (3.5)

or $a = k \cdot q$ (k = proportionality constant)

If the length of the side of a square having area 'a' is 'l', then

$l^2 = a$ (3.6)

[Area of a square with side length 'l' = $l^2$]

or $l^2 = k \cdot q$ (since $a = k \cdot q$)

$l = \sqrt{k \cdot q}$ (3.7)

So, for any item ('i'), corresponding to the quantity ('q'), the length of the side of the square ('l') can be expressed by the following equation:

$l_i = \sqrt{k \cdot q_i}$ (3.8)

To simplify the calculation and make it easier to follow, $\sqrt{k \cdot q_i}$ is written as $\sqrt{P_i}$ in the calculation table (Table 3.11).

Table 3.11 Worksheet for square diagram (Population of selected million cities of India, 2011). Scale selected: 1 cm to 1500 units

| Name of the Urban Agglomeration | Population | $l_i = \sqrt{P_i}$ | Length of the side of the square (cm) |
|---|---|---|---|
| Delhi | 16,314,838 | 4039.16 | 2.69 |
| Greater Mumbai | 18,414,288 | 4291.19 | 2.86 |
| Kolkata | 14,112,536 | 3756.66 | 2.50 |
| Chennai | 8,696,010 | 2948.90 | 1.96 |
| Bangalore | 8,499,399 | 2915.37 | 1.94 |
| Hyderabad | 7,749,334 | 2783.76 | 1.85 |
| Ahmedabad | 6,240,201 | 2498.04 | 1.66 |
| For proportional scale: Largest | 20,000,000 | 4472.13 | 2.98 |
| For proportional scale: Medium | 12,500,000 | 3535.53 | 2.36 |
| For proportional scale: Smallest | 5,000,000 | 2236.07 | 1.49 |

Source: Government of India, Ministry of Information: Production Division, India (2012), New Delhi, pp. 77–78 and Geography of India by Majid Husain, 2014
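The side-length formulas for proportional triangles (Eq. 3.4) and squares (Eq. 3.8) can be checked numerically against the worksheets in Tables 3.10 and 3.11. The Python sketch below is only an illustration under stated assumptions: the function names are hypothetical, and the proportionality constant k is taken as 1, which is what the tabulated values suggest.

```python
import math


def triangle_side(quantity, k=1.0):
    """Side of an equilateral triangle whose area equals k * quantity."""
    return math.sqrt(4.0 * k * quantity / math.sqrt(3.0))


def square_side(quantity, k=1.0):
    """Side of a square whose area equals k * quantity."""
    return math.sqrt(k * quantity)


def to_cm(length_in_units, units_per_cm):
    """Apply the selected drawing scale (units represented by 1 cm)."""
    return length_in_units / units_per_cm


# Sundarban biosphere reserve, 9,630 sq. km; scale 1 cm to 80 units (Table 3.10).
sundarban_side = triangle_side(9_630)
sundarban_cm = to_cm(sundarban_side, 80)

# Delhi urban agglomeration, 16,314,838 people; scale 1 cm to 1500 units (Table 3.11).
delhi_side = square_side(16_314_838)
delhi_cm = to_cm(delhi_side, 1500)

print(f"Sundarban: side {sundarban_side:.2f} units, drawn as {sundarban_cm:.2f} cm")
print(f"Delhi: side {delhi_side:.2f} units, drawn as {delhi_cm:.2f} cm")
```

Because the side grows with the square root of the quantity, a fourfold increase in the quantity only doubles the side. This is what lets area diagrams accommodate widely varying data on one map.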
For drawing the squares, a suitable scale should be selected carefully so that no individual square becomes too small or too large with respect to the given base map. Each square should be drawn within the boundary of the respective administrative unit of the base map. If no map is available, the squares should be drawn on the same baseline with a uniform distance between them. The diagram must contain a proportional scale with at least three squares representing roughly the largest, medium and smallest quantities of the given data (Fig. 3.12).

Fig. 3.12 Square diagram (Population of selected million cities of India, 2011)

3.4.2.4 Circular Diagram

Like the triangular and square diagrams, the circular diagram is an important two-dimensional diagram. It consists of a series of circles in which the size or area of each circle is directly proportional to the quantity it represents. In this diagram, both the total figure and the component parts or sectors can be easily represented. The area of each circle is directly proportional to the square of its radius.

The working principle for the construction of a circular diagram is that the area of a circle ('a') is directly proportional to the quantity ('q') to be represented. That is,

$a \propto q$ (3.9)

or $a = k \cdot q$ (k = proportionality constant)

If the radius and area of a circle are 'r' and 'a', respectively, then

$\pi r^2 = a$ (3.10)

[Area of a circle with radius 'r' = $\pi r^2$]

or $\pi r^2 = k \cdot q$ (since $a = k \cdot q$)

$r^2 = \frac{k \cdot q}{\pi}$

$r = \sqrt{\frac{k \cdot q}{\pi}}$ (3.11)

So, for any item ('i'), corresponding to the quantity ('q'), the radius of the circle ('r') can be expressed by the following equation:

$r_i = \sqrt{\frac{k \cdot q_i}{\pi}}$ (3.12)

To simplify the calculation and make it easier to follow, $\sqrt{k \cdot q_i / \pi}$ is written as $\sqrt{T_i / \pi}$ in the calculation table. Therefore, for the construction of a circular diagram, the radii of the circles are obtained by dividing the absolute figures (the respective aggregate values) by the value of pi (π) and taking the square root (Tables 3.12 and 3.13).

Table 3.12 Worksheet for simple circular diagram (Cropping pattern in India, 2010–2011). Scale selected: 1 cm radius to 1500 units

| Crops | Area in million hectares | $r_i = \sqrt{T_i / \pi}$ | Radius of the circle (cm) |
|---|---|---|---|
| Rice | 45.0 | 3784.70 | 2.52 |
| Wheat | 29.25 | 3051.32 | 2.03 |
| Jowar | 10.4 | 1819.46 | 1.21 |
| Bajra | 8.8 | 1673.66 | 1.12 |
| Maize | 6.4 | 1427.30 | 0.96 |
| Gram | 6.3 | 1416.10 | 0.94 |
| Pulses | 21.1 | 2591.59 | 1.73 |
| For proportional scale: Largest | 45 | 3784.70 | 2.52 |
| For proportional scale: Medium | 25 | 2820.95 | 1.88 |
| For proportional scale: Smallest | 5 | 1261.57 | 0.84 |

Source: Government of India, Ministry of Information: Production Division, India (2012), New Delhi, pp. 77–78 and Geography of India by Majid Husain, 2014

A suitable scale should be selected for drawing the circular diagram so that no individual circle becomes too large or too small. A proportional scale must be shown diagrammatically with at least three circles roughly representing the largest, medium and smallest values of the given data (Figs. 3.13 and 3.14).

Based on the nature of the data, circular diagrams are of two types: (i) simple circular diagram or proportional circles and (ii) sub-divided circle, also called compound circular diagram, angular diagram, pie diagram or wheel diagram.

Simple Circular Diagram

When the data consist of only one component (Table 3.12), simple circles are constructed in which each circle represents a single value.
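The radius formula of Eq. 3.12 can be illustrated with the rice entry of Table 3.12. The sketch below is a minimal Python example, not the author's procedure: the function name `circle_radius` is hypothetical, k is taken as 1, and the area is expressed in hectares rather than million hectares, because that assumption reproduces the tabulated value of about 3784.70.

```python
import math


def circle_radius(quantity, k=1.0):
    """Radius of a circle whose area equals k * quantity."""
    return math.sqrt(k * quantity / math.pi)


# Rice: 45.0 million hectares, entered here in hectares (an assumption that
# matches the worksheet value in Table 3.12).
rice_area_ha = 45.0e6
rice_radius = circle_radius(rice_area_ha)

# Scale selected in Table 3.12: 1 cm radius to 1500 units.
rice_radius_cm = rice_radius / 1500

print(f"Rice: radius {rice_radius:.2f} units, drawn as {rice_radius_cm:.2f} cm")
```

As with squares, the radius follows the square root of the quantity, so the areas of the circles, not their radii, stay proportional to the data.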
The same principles are followed for constructing a simple circular diagram as for a square diagram. The radii of the circles are taken in proportion to the square roots of the given figures, following the formula mentioned earlier (Eq. 3.12). If the radii are large, they are converted to conveniently small values by dividing the square roots by a suitable common value (Table 3.12). After the radii have been computed, the circles are drawn carefully, keeping in mind that the centres of circles placed side by side or one below another must lie on the same straight line (Fig. 3.13).

Fig. 3.13 Simple circular diagram (Cropping pattern in India, 2010–2011)

Fig. 3.14 Pie diagram (Consumption of different fertilizers in India)

Angular Diagram or Compound Circular Diagram or Pie Diagram or Wheel Diagram

When the data are composed of a total value and two or more component parts (Table 3.13), a compound circular diagram or pie diagram is constructed (Fig. 3.14). The area of the circle represents the total value, and the different sub-divisions or angular sectors of the circle represent the different component parts. In this diagram, the full angle of 360° at the centre of the circle corresponds to the total value, which is sub-divided into a number of smaller angles or angular sectors (Fig. 3.14). The degrees of these angular sectors are directly proportional to the values of the component parts (Table 3.13).
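The proportionality between sector angles and component values amounts to multiplying each component by 360° and dividing by the total. As a hedged illustration, the Python sketch below applies this to the 1991–92 fertilizer figures of Table 3.13; the function name `sector_angles` is an assumption introduced for this example.

```python
def sector_angles(components):
    """Return the central angle (degrees) of each component of a pie diagram."""
    total = sum(components.values())
    if total <= 0:
        raise ValueError("the component values must sum to a positive total")
    return {name: 360.0 * value / total for name, value in components.items()}


# Consumption of fertilizers in India, 1991-92 (lakh tonnes), from Table 3.13.
fertilizers_1991_92 = {
    "Urea": 140.04,
    "DAP": 45.18,
    "MOP": 17.01,
    "NPK Complex": 32.21,
    "SSP": 31.65,
}

angles = sector_angles(fertilizers_1991_92)

# List the sectors from largest to smallest, the order in which they are
# usually laid out around the circle.
for name, angle in sorted(angles.items(), key=lambda item: item[1], reverse=True):
    print(f"{name}: {angle:.2f} degrees")
```

The angles always sum to 360°, so a rounding error in one sector shows up immediately as a gap or overlap when the last sector is drawn with the protractor.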
The angular or sectoral divisions of the component parts within the circle may be computed by the following formula:

\[ s_{c_1} = \frac{360^\circ}{q} \times c_1 \tag{3.13} \]

where $s_{c_1}$ = angle of the sector for component 1 (in degrees), $q$ = total quantity of all the component parts and $c_1$ = quantity of component 1. Here,

\[ c_1 + c_2 + c_3 + \cdots + c_n = q \tag{3.14} \]

and

\[ s_{c_1} + s_{c_2} + s_{c_3} + \cdots + s_{c_n} = 360^\circ \tag{3.15} \]

The angular segments of a pie diagram should be drawn following a logical arrangement and sequence. As a common procedure, the sectors are drawn starting from a fixed line (generally the radius drawn due west or due north) and are arranged according to their size, with the largest at the top and the others running sequentially clockwise (Fig. 3.14). The circles and the angular segments are drawn with the help of a compass and a protractor. The sectors representing different components should be neatly coloured or clearly marked by different signs and symbols in order to make the diagram attractive. A well-organized legend of colours or signs and symbols should be provided to make the diagram meaningful and understandable.

Fig. 3.15 Percentage pie diagram showing the consumption of different fertilizers in India

Pie Diagram in Percentage

For the comparison of data, a percentage pie diagram is more appropriate and useful than an absolute one. In a series of absolute pie diagrams, the larger total must be represented by a larger circle and the smaller total by a smaller circle. This type of representation involves the difficulties and complications of two-dimensional comparisons.
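As a worked illustration of Eq. 3.13, the following Python sketch converts the 1991–92 fertilizer figures of Table 3.13 into sector angles and into the corresponding percentage shares. The function names are hypothetical and chosen only for this example.

```python
# Consumption of fertilizers in 1991-92 (lakh tonnes), from Table 3.13.
FERTILIZER_1991_92 = {
    "Urea": 140.04,
    "DAP": 45.18,
    "MOP": 17.01,
    "NPK Complex": 32.21,
    "SSP": 31.65,
}


def sector_angles(components):
    """Angle of each sector in degrees: (360 / q) * c, as in Eq. 3.13."""
    total = sum(components.values())
    if total <= 0:
        raise ValueError("the total quantity must be positive")
    return {name: 360.0 * value / total for name, value in components.items()}


def sector_percentages(components):
    """Share of each component in the total, in percent."""
    total = sum(components.values())
    if total <= 0:
        raise ValueError("the total quantity must be positive")
    return {name: 100.0 * value / total for name, value in components.items()}


if __name__ == "__main__":
    angles = sector_angles(FERTILIZER_1991_92)
    shares = sector_percentages(FERTILIZER_1991_92)
    # Largest sector first, matching the usual drawing order.
    for name in sorted(angles, key=angles.get, reverse=True):
        print(f"{name:<12} {angles[name]:7.2f} deg {shares[name]:6.2f} %")
```

The angles always add up to 360° (Eq. 3.15), which gives a quick check on a hand-computed worksheet.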
If, however, the pie diagrams are constructed from percentage values, every absolute total (larger or smaller) is treated as 100%, and hence all the pie diagrams become equal in size (Fig. 3.15). To construct a percentage pie diagram, all the component values are converted into percentages of the total value (Table 3.13). The 100% value is then represented by the 360° angle at the centre of the circle, and hence 1% is represented by $360^\circ/100 = 3.6^\circ$. For example, if $P$ is the percentage value of a certain component, it is represented by an angle of $(3.6^\circ \times P)$.

Table 3.13 Worksheet for pie diagram (Consumption of fertilizers in India, lakh tonnes)

Part A: absolute values and radius of the circle

| Year | Urea | DAP | MOP | NPK Complex | SSP | Total | $r_i = \sqrt{T_i/\pi}$ | Scale selected | Radius of the circle (cm) |
|---|---|---|---|---|---|---|---|---|---|
| 1991–92 | 140.04 | 45.18 | 17.01 | 32.21 | 31.65 | 266.09 | 2910.31 | 1 cm radius to 1500 units | 1.94 |
| 2000–01 | 191.86 | 58.84 | 18.29 | 47.80 | 28.60 | 345.39 | 3315.73 | | 2.21 |
| 2012–13 | 300.02 | 91.54 | 22.11 | 75.27 | 40.30 | 529.24 | 4104.42 | | 2.74 |
| 2013–14 | 306.00 | 73.57 | 22.80 | 72.64 | 38.79 | 513.8 | 4044.10 | | 2.70 |
| 2014–15 | 306.10 | 76.26 | 28.53 | 82.78 | 39.89 | 533.56 | 4121.13 | | 2.75 |
| For proportional scale: Largest | | | | | | 600 | 4370.19 | | 2.91 |
| For proportional scale: Medium | | | | | | 400 | 3568.25 | | 2.38 |
| For proportional scale: Smallest | | | | | | 200 | 2523.13 | | 1.68 |

Part B: consumption of fertilizers in degrees

| Year | Urea | DAP | MOP | NPK Complex | SSP | Total (degrees) |
|---|---|---|---|---|---|---|
| 1991–92 | 189.46 | 61.13 | 23.01 | 43.58 | 42.82 | 360 |
| 2000–01 | 199.97 | 61.32 | 19.06 | 49.82 | 29.81 | 360 |
| 2012–13 | 204.08 | 62.27 | 15.04 | 51.20 | 27.41 | 360 |
| 2013–14 | 214.40 | 51.54 | 15.97 | 50.90 | 27.18 | 360 |
| 2014–15 | 206.53 | 51.45 | 19.24 | 55.85 | 26.91 | 360 |

Part C: consumption of fertilizers in percentage

| Year | Urea | DAP | MOP | NPK Complex | SSP | Total (%) |
|---|---|---|---|---|---|---|
| 1991–92 | 52.63 | 16.98 | 6.39 | 12.10 | 11.89 | 100 |
| 2000–01 | 55.55 | 17.03 | 5.29 | 13.84 | 8.28 | 100 |
| 2012–13 | 56.69 | 17.30 | 4.18 | 14.22 | 7.61 | 100 |
| 2013–14 | 59.56 | 14.32 | 4.44 | 14.14 | 7.54 | 100 |
| 2014–15 | 57.37 | 14.29 | 5.35 | 15.51 | 7.47 | 100 |

Source: State of Indian Agriculture 2015–16, Government of India; State Governments

Disadvantages of Pie Diagrams

Though the pie diagram is frequently used as a common statistical technique, its construction is time-consuming compared with other diagrams, especially the bar diagram. Accurate reading and interpretation of a pie diagram become very difficult when the circle is divided into a large number of component sectors or when the components differ very little from one another. Generally, a pie diagram is not suitable when the data is composed of more than five or six components or categories. With eight or more components, it becomes very difficult and confusing to differentiate the relative quantities, especially when there are several small sectors of approximately the same size. In comparison, the pie diagram generally appears inferior to other diagrams and curves such as the compound bar diagram or a group of curves.

3.4.2.5 Doughnut Diagram

Like the pie diagram, the doughnut diagram displays the relationship of component parts to a whole, but it is capable of containing more than one data series (Table 3.14). In this diagram, each set of data is represented by a ring, with the first data set displayed at the centre and the last data set towards the outside. As in the pie diagram, the component items of a doughnut diagram are represented by individual slices (Fig. 3.16).
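A minimal Python sketch of this idea follows, using two of the districts of Table 3.14 as two rings. It only computes the percentage share of each slice within its own ring; the function name and the data layout are assumptions made for illustration.

```python
LAND_USES = [
    "Agricultural land",
    "Forest land",
    "Waste land",
    "Water bodies and barren land",
    "Land under miscellaneous use",
]

# Area in thousand acres, from Table 3.14 (one ring per district).
AREA_BY_DISTRICT = {
    "Purulia": [336.06, 75.05, 49.27, 68.32, 87.05],
    "Bankura": [367.02, 66.02, 22.17, 75.16, 100.2],
}


def ring_percentages(values):
    """Percentage share of each slice within one ring (one data series)."""
    if any(v < 0 for v in values):
        raise ValueError("a ring cannot contain negative values")
    total = sum(values)
    if total == 0:
        raise ValueError("a ring needs a positive total")
    return [100.0 * v / total for v in values]


if __name__ == "__main__":
    for district, areas in AREA_BY_DISTRICT.items():
        print(district)
        for land_use, share in zip(LAND_USES, ring_percentages(areas)):
            print(f"  {land_use:<30} {share:6.2f} %")
```

Each ring totals 100%, so the same land-use category can be compared across rings by its share rather than by the visual size of its slice.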
If we want to demonstrate the changes in the component parts of something, a doughnut diagram is more appropriate than other diagrams such as bar or pie diagrams, because it gives a bird's-eye view of the relative changes in each component part of the data series.

A doughnut diagram shows category groups, series groups and series values in the form of doughnut slices. The size of each slice is directly proportional to the value it represents relative to the total. The data labels and the totals can be displayed in the hole at the centre of the doughnut to make it easier to compare segments. If the data labels are given in percentages, each ring totals 100%.

Doughnut diagrams are of two types: simple doughnut and exploded doughnut. An exploded doughnut diagram is identical to a simple doughnut diagram except that the slices are moved away from the centre of the diagram, leaving gaps between them.

When the Doughnut Diagram Should Be Used

1. When more than one data series is to be represented.
2. When the data series contains no negative values.
3. When the data do not have more than seven or eight component parts.

Fig. 3.16 Doughnut diagram (Area under different land uses in selected districts of West Bengal)

Table 3.14 Database for doughnut diagram (Area under different land uses in selected districts of West Bengal; area in thousand acres)

| Name of the district | Agricultural land | Forest land | Waste land | Water bodies and barren land | Land under miscellaneous use |
|---|---|---|---|---|---|
| Purulia | 336.06 | 75.05 | 49.27 | 68.32 | 87.05 |
| Bankura | 367.02 | 66.02 | 22.17 | 75.16 | 100.2 |
| Paschim Medinipur | 467 | 51.06 | 57 | 82 | 175 |
| Birbhum | 304 | 47 | 32 | 52 | 116 |
| Jalpaiguri | 396 | 82.16 | 30 | 69.20 | 141 |

Advantages and Disadvantages of Doughnut Diagram

The major advantages and disadvantages of using the doughnut diagram include:

Advantages

1. Multiple data sets can be easily represented in a doughnut diagram.
2. The diagram gives a bird's-eye view of the relative changes of the component items within the data series.
3. Comparison of component parts using different slices becomes easy.
4. The blank space inside a doughnut diagram can be used to show the information which the diagram actually indicates.

Disadvantages

1. Due to their circular shape, doughnut diagrams are not easy to understand, especially when they represent numerous sets of data.
2. The volume of data is not represented accurately by the proportions of the outer and inner rings. Data points on inner rings may appear smaller than data points on outer rings even though the actual values are larger or the same. Because of this, the values or percentages must be displayed in data labels to make the diagram accurate and useful.

Difference Between Pie Diagram and Doughnut Diagram

Though the pie diagram and the doughnut diagram both display the relationship of component parts to a whole, they differ under the following heads:

| Pie diagram | Doughnut diagram |
|---|---|
| 1. Demonstrates the size differences of component parts to a whole for one data series only. Thus, it is difficult to represent multiple data sets in a pie diagram | 1. Size differences of component parts to a whole for multiple data sets can be easily represented in a doughnut diagram |
| 2. Proportions of the areas of the slices to one another and to the diagram as a whole are significant when comparing multiple pie diagrams | 2. Focuses more on the length of the arcs of the rings than on comparing the proportions of areas between slices |
| 3. The inner cut-out percentage defaults to 0 for pie diagrams | 3. The inner cut-out percentage defaults to 50 for doughnuts |
| 4. Less space-efficient, as no blank space exists inside a pie diagram | 4. Space-efficient, as the blank space inside a doughnut diagram can be used to show information |
| 5. Unable to give a bird's-eye view of the relative changes of component parts within the data set | 5. Can give a bird's-eye view of the relative changes of component parts within multiple sets of data |
| 6. Comparison of different component parts is difficult | 6. Comparison of different component parts using different slices becomes easy |

3.4.3 Three-Dimensional Diagrams

Three-dimensional diagrams are those in which three dimensions, namely length, width (breadth) and height, are taken into consideration. These diagrams are also known as volume diagrams. Some important three-dimensional diagrams are described in the following sub-sections.

3.4.3.1 Cube Diagram

The cube diagram is an important three-dimensional diagram suited to the representation of items with wide differences between them, say, when the smallest and the largest values are in the ratio of 1:1000 (Saksena 1981). In this diagram, the volumes of the cubes are in the same proportion as the actual data. The construction of the cube diagram is based on the principle that the volume of a cube ('v') is directly proportional to the quantity ('q') it represents.
Thus,

\[ v \propto q \tag{3.16} \]

or $v = k\,q$, where $k$ is the proportionality constant. If the length of the side and the volume of a cube are $l$ and $v$, respectively, then

\[ l^3 = v \tag{3.17} \]

since the volume of a cube with side length $l$ is $l^3$. Substituting $v = k\,q$ gives $l^3 = k\,q$, so that

\[ l = \sqrt[3]{k\,q} \tag{3.18} \]

So, for any item $i$ with quantity $q_i$, the length of the side of the cube $l_i$ is given by:

\[ l_i = \sqrt[3]{k\,q_i} \tag{3.19} \]

To simplify the calculation, $\sqrt[3]{k\,q_i}$ may be written as $\sqrt[3]{P_i}$ in the calculation table.

To construct a cube diagram, the cube roots of the data are first calculated with the help of logarithms: the logarithm of each figure is divided by 3, and the antilog of the result gives the cube root. The sides of the cubes are then made proportional to the cube roots of the given figures. If the sides of the cubes are too large, they should be reduced to a convenient size by dividing the cube roots by a common value.

Fig. 3.17 Steps of construction of cube diagram

Steps to Construct Cube Diagram

The following steps should be followed to construct a cube diagram:

1. At first, a square should be drawn with the length of the side of the cube to be portrayed (Fig. 3.17I).
2. Another square of the same size should be drawn with its lower-left corner coinciding with the centre of the first square, so that the corresponding sides of the two squares are parallel to each other (Fig. 3.17II).
3. Then the upper-left, upper-right and lower-right corners of the two squares should be joined by straight lines (Fig. 3.17III).
4. Lastly, the left-hand side and the lower side of the second square should be erased; the resultant figure is a cube (Fig. 3.17IV).

The scale for drawing the cube diagram should be selected in such a way that none of the individual cubes is too large or too small in size.
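The logarithm method described above (cube root = antilog of one third of the logarithm) can be sketched as follows. The populations are the largest and smallest tribal groups of Table 3.15, and the scale of 1 cm to 100 units is the one selected there; the function names are hypothetical.

```python
import math

UNITS_PER_CM = 100  # scale selected in Table 3.15: 1 cm to 100 units


def cube_root_by_logs(number):
    """Cube root computed as antilog(log10(number) / 3)."""
    if number <= 0:
        raise ValueError("logarithms need a positive number")
    return 10 ** (math.log10(number) / 3)


def cube_side_cm(quantity, units_per_cm=UNITS_PER_CM):
    """Side of the cube on paper after applying the selected scale."""
    return cube_root_by_logs(quantity) / units_per_cm


if __name__ == "__main__":
    for tribe, population in [("Bhil", 12_689_952), ("Sugalis", 2_077_947)]:
        root = cube_root_by_logs(population)
        print(f"{tribe:<8} {root:8.2f} {cube_side_cm(population):5.2f}")
```

Because the side grows only with the cube root of the quantity, a population roughly six times larger produces a cube whose side is less than twice as long, which is why the cube diagram copes well with widely differing values.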
Where no map is available, the cubes should be drawn on the same baseline with equal intervening space. A proportional scale must be shown diagrammatically with at least three cubes roughly representing the largest, medium and smallest values of the given data (Fig. 3.18). Table 3.15 shows the population of the main seven tribal groups in India according to the 2011 census, and the data is represented using a cube diagram in Fig. 3.18.

Fig. 3.18 Cube diagram (Population of main seven tribes in India)

Table 3.15 Worksheet for cube diagram (Population of main seven tribes in India, 2011)

| Name of tribes | Population | $l_i = \sqrt[3]{P_i}$ | Scale selected | Side of the cube (cm) |
|---|---|---|---|---|
| Bhil | 12,689,952 | 233.25 | 1 cm to 100 units | 2.33 |
| Gond | 10,859,422 | 221.45 | | 2.21 |
| Santal | 5,838,016 | 180.06 | | 1.80 |
| Mina | 3,800,002 | 156.05 | | 1.56 |
| Naikda | 3,344,954 | 149.55 | | 1.50 |
| Oraon | 3,142,145 | 146.47 | | 1.46 |
| Sugalis | 2,077,947 | 127.61 | | 1.28 |
| For proportional scale: Largest | 13,000,000 | 235.13 | | 2.35 |
| For proportional scale: Medium | 7,500,000 | 195.74 | | 1.96 |
| For proportional scale: Smallest | 2,000,000 | 125.99 | | 1.26 |

Source: Census of India

N.B. Cube roots may be calculated as follows:

\[ \text{Cube root of a number} = \operatorname{antilog}\left(\frac{\log(\text{number})}{3}\right) \]

3.4.3.2 Sphere Diagram

The sphere diagram is another important three-dimensional diagram, consisting of a series of spheres constructed on the principle that the volume of each sphere ('v') is directly proportional to the quantity ('q') it represents.
Thus,

\[ v \propto q \tag{3.20} \]

or $v = k\,q$, where $k$ is the proportionality constant. If the radius and volume of the sphere are $r$ and $v$, respectively, then

\[ \frac{4}{3}\pi r^3 = v \tag{3.21} \]

since the volume of a sphere with radius $r$ is $\frac{4}{3}\pi r^3$. Substituting $v = k\,q$ gives $\frac{4}{3}\pi r^3 = k\,q$, so that $r^3 = k\,\frac{3q}{4\pi}$ and

\[ r = \sqrt[3]{k\,\frac{3q}{4\pi}} \tag{3.22} \]

For any item $i$ with quantity $q_i$, the radius of the sphere $r_i$ can be expressed by the following equation:

\[ r_i = \sqrt[3]{k\,\frac{3q_i}{4\pi}} \tag{3.23} \]

To simplify the calculation, $\sqrt[3]{k\,\frac{3q_i}{4\pi}}$ may be written as $\sqrt[3]{\frac{3q_i}{4\pi}}$ in the calculation table (Table 3.16).

For drawing the sphere diagram, the scale should be selected carefully so that none of the individual spheres becomes too large or too small. Spheres are generally drawn within the boundary of the administrative unit on the given map. Where no map is available, the spheres can be drawn on the same baseline with equal distance between them. The diagram must contain a proportional scale with at least three spheres roughly representing the largest, medium and smallest quantities of the given data. Curved lines should be drawn carefully on the surface of each sphere to represent parallels and meridians, so that the spheres appear as three-dimensional figures involving volume (Fig. 3.19).

Fig. 3.19 Sphere diagram (Urban population of selected states in India, 2011)

Table 3.16 Worksheet for sphere diagram (Urban population in selected states in India, 2011)

| Name of the state | Urban population | $r_i = \sqrt[3]{3q_i/(4\pi)}$ | Scale selected | Radius of the sphere (cm) |
|---|---|---|---|---|
| Uttar Pradesh | 4,44,70,455 | 219.78 | 1 cm to 120 units | 1.83 |
| Maharashtra | 5,08,27,531 | 229.79 | | 1.91 |
| Bihar | 1,17,29,609 | 140.95 | | 1.17 |
| West Bengal | 2,91,34,060 | 190.88 | | 1.59 |
| Andhra Pradesh | 2,83,53,745 | 189.16 | | 1.58 |
| Madhya Pradesh | 2,00,59,666 | 168.56 | | 1.40 |
| Tamil Nadu | 3,49,49,729 | 202.82 | | 1.69 |
| For proportional scale: Largest | 6,00,00,000 | 242.86 | | 2.02 |
| For proportional scale: Medium | 3,50,00,000 | 202.92 | | 1.69 |
| For proportional scale: Smallest | 1,00,00,000 | 133.65 | | 1.11 |

Source: Census of India

3.4.4 Other Diagrams

3.4.4.1 Pictograms

Pictograms are another important and popular technique, in which statistical or geographical data are represented by pictorial symbols such as sacks, bales, tanks, discs etc. (Singh and Singh 1991) (Table 3.17). Unlike lines or bars, a pictogram is not an abstract representation: it actually depicts the kind of data we want to represent. This makes the method especially suitable for presenting statistical and geographical data to the layman. In a pictogram, a number of pictures and symbols are drawn to represent different types of data.

Table 3.17 Data for pictograms (Production of wheat in different years in India)

| Year | Wheat (Rabi) production (million tonnes) | Scale selected | Number of pictorial symbols |
|---|---|---|---|
| 2004–05 | 68.6 | One pictorial symbol represents 10 million tonnes of wheat | 7 |
| 2010–11 | 86.9 | | 9 |
| 2011–12 | 94.9 | | 10 |
| 2012–13 | 93.5 | | 10 |
| 2013–14 | 95.9 | | 10 |

Pictogram accompanying Table 3.17: number of sacks drawn for each year (2004–05, 2010–11, 2011–12, 2012–13 and 2013–14).

Principles of Drawing of Pictograms

The following points should be kept in mind while a pictogram is constructed:

a. Pictorial symbols should usually be of the same size and equal in value. Each picture represents a fixed number of units or a particular quantity (Table 3.17).
b. All the pictorial symbols should be self-explanatory. For example, if we want to represent the male population, the symbol should unmistakably indicate the male population.
c. A symbol must indicate the general idea only (like a boy, girl, truck, bus etc.) and not a specific individual (not Hitler or Akbar etc.).
d. All the pictorial symbols should be simple, clear, concise, interesting, easy to understand and easily distinguishable from every other symbol.
e. Variations in quantities or numbers should be represented by fewer or more symbols, not by smaller or larger symbols (Table 3.17).
f. All the symbols should be drawn to suit the size of the paper, i.e. they should not be too small or too large.
g. Generally, the pictorial symbols are drawn horizontally (side by side), but they may also be drawn vertically.
h. The quantity or the number of units represented by each pictorial symbol should be clearly mentioned.
i. Part of a picture may be used to represent a fraction of the value represented by each picture.

Examples

To represent 60 million tonnes of wheat produced in a region, six sacks may be heaped together when one sack represents 10 million tonnes of wheat. Similarly, to represent 80 aeroplanes in an airport, eight aeroplane symbols may be drawn together when one symbol represents 10 aeroplanes.

Advantages and Disadvantages of the Use of Pictograms

The major advantages and disadvantages of the use of pictograms are:

Advantages

(1) Pictograms are more attractive and impressive than other types of diagrams. They are very popular when the attention of the general public has to be attracted, as in exhibitions, fairs etc.
(2) Facts and events represented in pictorial form are usually remembered longer than those represented in tables or other diagrammatic forms.
(3) Comparison of different data sets becomes easy when they are represented in pictorial form.

Disadvantages

(1) Drawing pictograms is difficult, as it requires some artistic sense.
(2) Pictograms provide only an overall idea of a fact or event; they do not offer minute details.
(3) In a pictogram, one symbol must correspond to a fixed quantity or fixed number of units, which may create problems. For example, if one symbol represents a population of five lakhs, it is not obvious how many symbols are required to represent a population of 27.3 lakhs.

3.4.4.2 Kite Diagrams

A kite diagram represents the change in the percentage cover of geographical phenomena or characteristics over distance. It is most frequently used to show the changes in the percentage cover of different plant species along an environmental gradient (the change of environmental conditions with distance). For example, the change of plant species from the edge of a footpath, along a sand dune transect (Fig. 3.20 and Table 3.18) or along a coastline can be easily represented in a kite diagram.

Table 3.18 Database for kite diagram (Number of vegetation species along the sand dune transects). Each cell gives the number of individuals of the species, with the percentage in brackets.

| Distance in metres (from sea to inland) | Couch grass | Dandelion | Meadow grass |
|---|---|---|---|
| 0 | 63 (40%) | 12 (6%) | 0 (0%) |
| 10 | 46 (29%) | 11 (5%) | 0 (0%) |
| 20 | 24 (15%) | 20 (9%) | 0 (0%) |
| 30 | 15 (10%) | 34 (16%) | 0 (0%) |
| 40 | 8 (5%) | 55 (26%) | 0 (0%) |
| 50 | 0 | 32 (15%) | 12 (6%) |
| 60 | 0 | 21 (10%) | 14 (8%) |
| 70 | 0 | 10 (5%) | 24 (13%) |
| 80 | 0 | 10 (5%) | 32 (17%) |
| 90 | 0 | 5 (2%) | 45 (24%) |
| 100 | 0 | 2 (1%) | 60 (32%) |

Fig. 3.20 Kite diagram showing the number of vegetation species along the sand dune transect

Procedures to Draw Kite Diagrams

Kite diagrams are drawn using the following steps:

1. At first, a scale line is drawn to represent the distance covered in the survey.
2. One row is needed to represent each plant species.
3. Every row must follow the same scale and must be wide enough (sufficiently apart from the others) to allow 100% for each plant species.
4. A line is then drawn through the middle of each row (the central line), representing the value '0'.
5. At each survey point, the percentage value is plotted on both sides of the central line, above and below it, to achieve symmetry.
6. The plotted points are then connected for each row, which gives the diagram its kite-like appearance.
7. The area between the kite lines is then shaded (Fig. 3.20).

Advantages and Disadvantages of Using Kite Diagrams

The use of the kite diagram to represent geographical data has some advantages as well as some disadvantages:

Advantages

1. Very easy to understand and interpret.
2. Clearly shows the changes of different geographical phenomena over distance.
3. Shows the density and distribution of geographical variables.

Disadvantages

1. Not suitable for the representation of all types of data.
2. Time-consuming to plot manually.

References

Saksena RS (1981) A handbook of statistics. Indological Publishers & Booksellers
Sarkar A (2015) Practical geography: a systematic approach. Orient Blackswan Private Limited, Hyderabad, Telangana, India. ISBN: 978-81-250-5903-5
Sharma PD (1975) Ecology and environment. Rastogi Publications, Gangitri, Shivaji Road, Meerut-250002. ISBN: 978-93-5078-122-7
Singh RL, Singh RPB (1991) Elements of practical geography. Kalyani Publishers, New Delhi

Chapter 4 Mapping Techniques of Geographical Data

Abstract A map is a simplified depiction of geographical data about the whole earth or a part of it on a plane surface or paper, for a better understanding of its cartographic characteristics.
Maps are the basic tools for geographers and researchers for the visualization of geographic data and the understanding of their spatial relationships. This chapter explains the basic cartographic terminologies such as Geodesy, Geoid, Spheroid, Datum, Geographic co-ordinate system, Surveying and levelling, Traversing, Bearing, Magnetic declination, Magnetic inclination etc. in a lucid manner with suitable illustrations. It includes a detailed classification and discussion of all types of maps based on their scale and the purpose (content) for which they are prepared, with special emphasis on Indian Topographical Sheets. All pictorial and mathematical methods of representation of relief have been explained in detail with suitable examples and illustrations. Various types of distributional thematic maps have been analyzed with suitable examples, emphasizing their data structure, the necessary numerical calculations, the methods and principles of their construction, proper illustrations and the advantages and disadvantages of their use. The step-by-step, systematic discussion of the methods of construction makes the maps easy and quick to understand for readers and users. Emphasis has also been given to a detailed discussion of the techniques of measuring direction, distance and area on maps.

Keywords Mapping technique · Cartographic terminologies · Representation of relief · Distributional thematic maps · Importance and uses of maps

4.1 Concept and Definition of Map

Maps are the basic tools for the visualization of geographic data and the understanding of their spatial relationships. A map is a simplified representation of the whole or a part of the earth on a plane surface or paper. It is a two-dimensional depiction of the three-dimensional earth. As it is impossible to represent all aspects of the earth's surface in their actual size and form, a map is drawn at a reduced scale.
Maps are drawn in such a way that each and every point on them truly corresponds to the actual ground surface.

© The Author(s), under exclusive license to Springer Nature Singapore Pte Ltd. 2021. S. K. Maity, Essential Graphical Techniques in Geography, Advances in Geographical and Environmental Sciences, https://doi.org/10.1007/978-981-16-6585-1_4

A map can be defined as a reduced (scaled), generalized and explained depiction (image) of objects, elements and events on the Earth or in space, constructed on a two-dimensional plane surface or paper by applying mathematically defined relationships (i.e. maintaining correct relative locations, sizes and orientations). The term reduction relates to the length scale of a map, which is the ratio between a length on the map and the corresponding length on the ground. Generalization is an inevitable outcome of the reduction, as not all particulars can be represented on a map in full detail. The explanation tells us about the modes of expression and appearance through the use of a legend. In other words, a map is the representation of the earth's pattern as a whole or in part, or of the heavens, on a two-dimensional flat surface, following a suitable scale and projection and using conventional symbols so that each and every point on it truly corresponds to the actual terrestrial or celestial position (Fig. 4.2). Three-dimensional maps can be made only with modern computer graphics. Globes are maps portrayed on the surface of a sphere.

A map illustrates information about the earth in a simple and visual way. It serves two functions: it acts as a spatial database and as a communication device. Basic map features tell the user where an object or event is (its location) and what the object or event is (its characteristics).
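The idea of reduction, i.e. the ratio between a length on the map and the corresponding length on the ground, can be illustrated with a short Python sketch. The ratio scales used here (1:10,000 and 1:50,000) and the function names are assumptions chosen only for illustration.

```python
def ground_distance_m(map_distance_cm, scale_denominator):
    """Ground distance in metres for a map distance in centimetres.

    A ratio scale of 1:scale_denominator means that one unit on the map
    corresponds to scale_denominator of the same units on the ground.
    """
    return map_distance_cm * scale_denominator / 100.0  # 100 cm in a metre


def map_distance_cm(ground_m, scale_denominator):
    """Map distance in centimetres for a ground distance in metres."""
    return ground_m * 100.0 / scale_denominator


if __name__ == "__main__":
    # On a hypothetical 1:10,000 map, 1 cm on paper covers 100 m on the ground.
    print(ground_distance_m(1, 10_000))
    # A 5 km road on a hypothetical 1:50,000 map is drawn 10 cm long.
    print(map_distance_cm(5_000, 50_000))
```

The larger the denominator, the stronger the reduction and the more generalization the map requires.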
The amount of information to be depicted on the map depends on: (a) the scale of the map, (b) the projection used, (c) the methods of map-making, (d) the conventional symbols and (e) the skill and efficiency of the draughtsman or map maker etc.

4.2 Concept of Plan

A plan is the graphical representation of various features on or near the surface of the earth on a horizontal plane at a large scale. The curvature of the earth is not taken into consideration in a plan. Therefore, it is suitable for smaller areas, where distortions related to the curvature of the earth's surface can be avoided. The main purpose of a plan is to capture precisely and unambiguously all the geometric features of an area, place, building or component (Fig. 4.1).

4.3 Difference Between Plan and Map

The differentiation between a plan and a map is not easy, as it is arbitrary in nature. The main areas of distinction include:

| Plan | Map |
|---|---|
| 1. Graphical representation of features on or near the surface of the earth on a plane or flat surface at a large scale. Scale is 1 cm = 10 m or <10 m | 1. Graphical representation of the whole or part of the earth on a plane surface at a small scale compared with the plan. Scale is 1 cm = 100 m or >100 m |
| 2. Plans are commonly used in technical fields like architecture, engineering, planning etc. | 2. Maps are commonly used to depict geography |
| 3. Horizontal distances and directions are generally shown on a plan | 3. In a number of maps, vertical distances (elevations) are also shown along with the horizontal distances and directions. For example, on a topographical map, elevations are shown by contour lines |
| 4. A plan is drawn for small areas. For example, plan of a house, plan of a market complex, plan of a college campus etc. | 4. A map is drawn for a large area. For example, map of Asia, map of India, map of West Bengal etc. |
| 5. In a plan, details are given in the form of symbols | 5. A map contains a lot of important information about the area |

Fig. 4.1 Plan of a college campus

4.4 Elements of a Map

Several important elements should be incorporated whenever a map is prepared, so that viewers can understand and interpret it better. A few maps may carry more than this basic information, but all maps should contain five basic elements: Title, Grid, Scale, Legend and North Arrow. These elements play an important role in describing the details of the map.

1. Title

The title is one of the fundamental features of a map and is very important because it lets the viewers know the general subject matter of the map and the geographic area it represents. A short and catchy title might be appropriate if the readers already know the theme presented on the map. A suitable title, whether short or long, should answer the viewers' questions 'What? Where? When?'. The title 'Sediment yield in global rivers' quickly tells the readers the theme and location of the data represented in the map (Fig. 4.2).

Fig. 4.2 Elements of a map (Source: Sediment yield in global rivers, Milliman and Meade 1983)

2. Grid

The geographic grid system, or latitude and longitude marks, helps the viewers to identify the exact location of a place or object on a map. A grid is represented by a series of vertical and horizontal lines running across the map, representing longitudes and latitudes, respectively (Fig. 4.2). Latitude lines (parallels) run east–west around the globe, while longitude lines (meridians) run north–south. The points of intersection of parallels and meridians are called co-ordinates. The parallels and meridians are labelled with letters and numbers indicating the values of latitudes and longitudes. On large-scale maps (where objects and phenomena are shown in greater detail), the grids are generally assigned letters and numbers.
Segments (boxes) of the grid may be identified as A, B, C etc. across the top and 1, 2, 3 etc. down the left side of the map. If a stadium is located in box B4 of the grid and this is mentioned in the index of the map, the viewer can easily find the stadium by looking at the box where column B and row 4 cross.

3. Scale

The scale represents the relation between a specific distance on the map and the actual distance in the real world, i.e. on the ground. There are three main methods of representing map scale: (1) Statement or Verbal Scale (e.g. 1 cm on the map is equivalent to 5 km on the ground, or 1 inch on the map is equivalent to 10 miles on the ground), (2) Numeric or Ratio Scale (e.g. 1:10,000, which means that one map unit represents 10,000 of the same units in the real world, so that one inch on the map equals 10,000 inches and one cm on the map equals 10,000 cm on the ground) and (3) Graphical Scale (the ratio of map distance to ground distance is shown graphically in the form of a scale bar such as a linear scale, diagonal scale etc.) (Fig. 4.2). In computer-generated maps, the graphical form of representation of scale is generally preferred. Maps that are not drawn to scale are required to carry a 'Not to scale' notation.

4. Legend

Cartographers use different symbols and colours to represent various geographic features. For example, black dots represent cities; various sorts of lines represent national and international boundaries, roads, rivers etc.; green is used for forest and blue for water. The legend is the key element of a map, describing all unknown or unique symbols and colours on it. The legend acts as the decoder and explains what the various symbols and colours used in the map represent. Any colour combinations, symbology or categorization are clearly explained in the legend.
Without the legend, it would be difficult for the viewers to understand the symbols and colours used in the map. For example, in a land-use/land-cover map, the various land-use and land-cover categories are represented by different colours. The land-use/land-cover pattern would make no sense to the viewer unless a proper legend is given on the map. In Fig. 4.2, the legend helps the viewer to understand the amount of annual sediment yield in different river basins of the world.

5. North Arrow

The north arrow or compass rose indicates the orientation of the map, i.e. it shows the cardinal points (also called cardinal directions) of north, south, east and west (the four main points of a compass; a detailed discussion is given later). It also maintains a connection to the data frame (the part of the map displaying the data layers): as the data frame is rotated, the north arrow rotates with it. It helps the viewer to recognize the correct direction of the map in relation to due north (an exact cardinal direction may be indicated by putting the word “due” before it). Although there are a few exceptions, on most maps due north is oriented towards the top of the sheet (Fig. 4.2).

Other important elements of a map include:

6. Inset map or Locator

An inset map or locator is a smaller map placed on the main map to further aid the viewer. It is a type of reference map, which may show the relative location of the main map. An inset map may also display a detailed, zoomed-in portion of the main map.

7. Labels

The words identifying the locations on the map are called labels. They show different places (streets, rivers etc.) and establishments with their distinct names (Fig. 4.2).

8. Citation

The citation section of a map presents the metadata (description) of the map. This is the area that contains information such as data sources, date of creation and map projection.
Citations help users to determine whether the map is suitable for their own purposes (Fig. 4.2).

4.5 History of Map-Making

Maps are not an invention of modern human beings. The history of mapping the earth is as old as the history