Statistical Analysis System
SAS provides a list of procedures to perform descriptive statistics: PROC PRINT, PROC CONTENTS, PROC MEANS, PROC FREQ, PROC UNIVARIATE, PROC GCHART, PROC BOXPLOT, and PROC GPLOT.
PROC PRINT. It prints the observations in a SAS data set, by default using all the variables.
PROC CONTENTS. It describes the structure of a data set.
PROC MEANS. It provides data summarization tools to compute descriptive statistics for variables across all observations and within the groups of observations.
PROC FREQ. It produces one-way to n-way frequency and cross-tabulation tables. Frequencies can also be output to a SAS data set.
PROC UNIVARIATE. It goes beyond what PROC MEANS does: it is useful for conducting some basic statistical analyses and includes high-resolution graphical features.
PROC GCHART. The GCHART procedure produces six types of charts: block charts, horizontal and vertical bar charts, pie and donut charts, and star charts. These charts graphically represent the value of a statistic calculated for one or more variables in an input SAS data set. The charted variables can be either numeric or character.
PROC BOXPLOT. The BOXPLOT procedure creates a side-by-side box-and-whisker plot of measurements organized in groups. A box-and-whisker plot displays the mean, quartiles, and minimum and maximum observations for a group.
PROC GPLOT. The GPLOT procedure creates two-dimensional graphs, including simple scatter plots, overlay plots in which multiple sets of data points are displayed on one set of axes, plots against a second vertical axis, bubble plots, and logarithmic plots.
Hypothesis Testing in Statistics
The purpose of hypothesis testing is to choose between two competing hypotheses about the value of a population parameter. For example, one hypothesis might claim that the wages of men and women are equal, while the other might claim that women make more than men. Hypothesis testing is formulated in terms of two hypotheses.
- The Null Hypothesis, referred to as H-null (H0), is assumed to be true unless there is strong evidence to the contrary.
- The Alternative Hypothesis, referred to as H-1 (H1), is accepted when the evidence is strong enough to reject the null hypothesis.
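For the wage example above, the two hypotheses could be written as follows. The symbols for the population mean wages of women and men are introduced here only for illustration.

```latex
H_0:\ \mu_{\text{women}} = \mu_{\text{men}}
\qquad \text{versus} \qquad
H_1:\ \mu_{\text{women}} > \mu_{\text{men}}
```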
Hypothesis Testing Procedures
Let’s learn about hypothesis testing procedures. There are two types: parametric tests and non-parametric tests.
Parametric Tests: In statistical inference or hypothesis testing, traditional tests such as t-tests and ANOVA are called parametric tests. They depend on the specification of a probability distribution except for a set of free parameters. In simple words, if the population distribution is completely described by its parameters, the test is called a parametric test.
Non-Parametric Tests: If the population or parameter information is not known, and you are still required to test a hypothesis about the population, then it’s called a non-parametric test. Non-parametric tests do not require any strict distributional assumptions.
There are various parametric tests. They are as follows: t-test, ANOVA, chi-square, and linear regression. Let’s understand them in detail.
- T-Test. A t-test determines whether two sets of data are significantly different from each other. The t-test is used in the following situations: to test whether a mean is significantly different from a hypothesized value, and to test whether the means of two dependent or paired groups are significantly different.
- ANOVA. ANOVA is a generalized version of the t-test and is used to test whether the mean of an interval dependent variable differs across the levels of a categorical independent variable. When we want to compare the means of two or more groups by analyzing the variance between and within them, we apply the ANOVA test.
- Chi-square. Chi-square is a statistical test used to compare observed data with data you would expect to obtain according to a specific hypothesis.
- Linear regression. There are two types of linear regression: simple linear regression and multiple linear regression. Simple linear regression is used to test how well one variable predicts another variable. Multiple linear regression tests how well multiple variables, the independent variables, predict a variable of interest. When using multiple linear regression, we additionally assume that the predictor variables are not strongly correlated with one another. For example, finding the relationship between two variables, say sales and profit, is simple linear regression. Finding the relationship among three or more variables, say how cost and telemarketing together predict sales, is multiple linear regression.
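To make the arithmetic behind these tests concrete, here is a minimal sketch in Python rather than SAS, using only the standard library. The function names and the sample numbers are hypothetical, chosen for illustration. The sketch computes only the test statistics and the fitted line. Turning a statistic into a p-value additionally needs the t, chi-square, or F distribution, which is not covered here.

```python
import math
import statistics


def one_sample_t(sample, hypothesized_mean):
    """t statistic for H0: the population mean equals hypothesized_mean."""
    n = len(sample)
    sd = statistics.stdev(sample)  # sample standard deviation (divides by n - 1)
    return (statistics.mean(sample) - hypothesized_mean) / (sd / math.sqrt(n))


def paired_t(before, after):
    """t statistic for paired groups: a one-sample test on the differences."""
    diffs = [a - b for a, b in zip(after, before)]
    return one_sample_t(diffs, 0.0)


def chi_square(observed, expected):
    """Pearson chi-square statistic: sum of (observed - expected)^2 / expected."""
    return sum((o - e) ** 2 / e for o, e in zip(observed, expected))


def one_way_anova_f(groups):
    """F statistic: between-group mean square over within-group mean square."""
    all_values = [x for g in groups for x in g]
    grand_mean = statistics.mean(all_values)
    k, n = len(groups), len(all_values)
    ss_between = sum(len(g) * (statistics.mean(g) - grand_mean) ** 2 for g in groups)
    ss_within = sum((x - statistics.mean(g)) ** 2 for g in groups for x in g)
    return (ss_between / (k - 1)) / (ss_within / (n - k))


def simple_linear_regression(x, y):
    """Least-squares (slope, intercept) for y = intercept + slope * x."""
    mean_x, mean_y = statistics.mean(x), statistics.mean(y)
    sxy = sum((xi - mean_x) * (yi - mean_y) for xi, yi in zip(x, y))
    sxx = sum((xi - mean_x) ** 2 for xi in x)
    slope = sxy / sxx
    return slope, mean_y - slope * mean_x


if __name__ == "__main__":
    # Made-up numbers, for illustration only.
    sales = [10, 20, 30, 40]
    profit = [3, 5, 7, 9]
    slope, intercept = simple_linear_regression(sales, profit)
    print(f"profit = {intercept:.2f} + {slope:.2f} * sales")
    print("one-sample t:", one_sample_t([2, 4, 6], 2))
    print("ANOVA F:", one_way_anova_f([[1, 2, 3], [2, 3, 4], [6, 7, 8]]))
```

A larger absolute value of a statistic means stronger evidence against the null hypothesis. In practice the statistic is compared with a critical value, or converted to a p-value, for the chosen significance level.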