Curiosity

Never enough knowledge

Statistics

Statistics is the practice or science of collecting and analysing numerical data, often in large quantities, especially for the purpose of inferring proportions in a whole from those in a representative sample.

Definitions

  • Average – a generic term for a single representative value for a set of numbers, e.g.
    • Mean – calculated by adding together the observed values and dividing by the number of observations; may relate to a sample or the population – see table below
    • Median – the value mid-way along the ordered set of observations (if the number of observations is even, it is the average (mean) of the two ‘middle’ values)
    • Mode – the value in the distribution that has been observed with the greatest frequency; two or more values may share the greatest popularity
Sourced from here
  • Descriptive statistics – methods used to summarise or describe our observations; concerned with summarising or describing a sample
  • Dispersion – a measure of variability; the spread of a data distribution – describable by inter-quartile range (IQR) or standard deviation (SD)
  • Distribution – a collection of data or scores on a variable, showing all the possible values or intervals of the data and how often they occur. See normal distribution
  • Inferential statistics – using observations as a basis for making estimates or predictions, i.e. inferences about a situation that has not yet been investigated; concerned with generalising from a sample, to make estimates and inferences about a wider population
  • Inter-quartile range (IQR) – a measure of the spread of a sample or a population distribution, specifically the distance between the 25th and 75th percentiles. Equivalent to the difference between the 1st and 3rd quartiles
Sourced from here
  • Normal distribution – a normal distribution or Gaussian distribution is a probability distribution in which the values of a random variable are distributed symmetrically about the mean, so that they are spread equally on the left and the right side of the central tendency and form a bell-shaped curve. About two-thirds (roughly 68%) of observations are within one standard deviation (SD) either side of the mean.
Sourced from here
  • Statistic versus Parameter – a parameter is a number describing a whole population (e.g. population mean), while a statistic is a number describing a sample (e.g. sample mean)
Sourced from here
  • Sampling – random sampling may be blind sampling, e.g. picking numbered markers from a bag, or mechanical sampling, e.g. using a random number generator
  • Standard deviation – the square root of the variance of a sample or distribution
Sourced from here
  • Variables
    • Category variable – any variable that involves putting individuals into categories
    • Continuous variable – whatever two values one has it is always possible to imagine more values in between them, e.g. 2.5 between 2 and 3
    • Discrete variable – one in which possible values are clearly separated from one another e.g. number of children in a family has to be 0, 1, 2, 3 etc – can’t have 2.5
    • Nominal variable – giving names to the different forms the variable may take e.g. the brand name of a bicycle such as Raleigh
    • Ordinal variable – categories that can be put in order e.g. less or more of a characteristic – better, bigger or faster
    • Quantity variable – where one is looking for a numerical value – a quantity
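  • Variance – measures variability from the average or mean. It is calculated by taking the differences between each number in the data set and the mean, then squaring the differences to make them positive, and finally dividing the sum of the squares by the number of values in the data set. This gives the population variance; when estimating the variance of a population from a sample, the sum of squares is usually divided by one less than the number of values. It is the square of the standard deviation.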
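
The definitions above can be tried out on a small data set. The following is a minimal sketch using Python's standard `statistics` module (Python 3.8 or later); the eight observations are invented purely for illustration and are not taken from any of the sources. It assumes the data are treated as a whole population for the variance and standard deviation, and it uses the "inclusive" quartile method, which is only one of several conventions for locating quartiles.

```python
from statistics import (
    NormalDist, mean, median, multimode, pstdev, pvariance, quantiles, variance,
)

# Hypothetical observations, invented for illustration only.
data = [2, 4, 4, 4, 5, 5, 7, 9]

# Averages
avg = mean(data)         # sum of the values (40) divided by their number (8)
mid = median(data)       # even count, so the mean of the two middle values (4 and 5)
modes = multimode(data)  # list of the most frequent value(s); more than one if tied

# Dispersion
pop_var = pvariance(data)   # sum of squared differences from the mean, divided by n
pop_sd = pstdev(data)       # square root of the population variance
sample_var = variance(data) # same sum of squares, divided by n - 1

# Inter-quartile range: distance between the 1st and 3rd quartiles.
q1, q2, q3 = quantiles(data, n=4, method="inclusive")
iqr = q3 - q1

# Share of a normal distribution lying within one SD either side of the mean.
standard_normal = NormalDist(mu=0, sigma=1)
within_one_sd = standard_normal.cdf(1) - standard_normal.cdf(-1)

print(f"mean={avg}, median={mid}, mode(s)={modes}")
print(f"population variance={pop_var}, population SD={pop_sd}")
print(f"sample variance={sample_var:.3f}")
print(f"Q1={q1}, Q3={q3}, IQR={iqr}")
print(f"within one SD of the mean: {within_one_sd:.1%}")
```

For this data the mean is 5, the median is 4.5 and the only mode is 4. The population variance is 32 / 8 = 4, so the standard deviation is 2, while dividing by n - 1 instead gives a sample variance of 32 / 7, about 4.57. The last line prints roughly 68%, the "about two-thirds" quoted for the normal distribution.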

Further Reading

David Spiegelhalter, The Art of Statistics (Penguin Random House UK, 2019) ISBN:978-0-241-39863-0
Derek Rowntree, Statistics without Tears (Penguin Random House UK, updated edition 2018) ISBN:978-0-141-98749-1
Tom Chivers and David Chivers, How to Read Numbers (Weidenfeld & Nicolson 2021) ISBN:978-1-4746-1996-7

Copyright © 2023 Curiosity.
