Posts

Showing posts from October, 2017

Around Here Forty-Two and Forty-Three: 10/14-10/27

A look into what it is like to live in this wild, beautiful, chaotic whirlwind that is our life right now. One of my Spanish 3 Honors students, Mattison, is a seriously talented artist. Intentional Outdoor Hours: 517+ hours (of 1000). I've surpassed my outdoor hour count of last year (510 hrs all year!). I'm pretty sure I'm not going to reach 1000 before the end of the year, but I'm already pretty pumped that I beat last year's attempt. Always moving forward, right? I snagged some outdoor time this week in the beautiful, sometimes crisp, sometimes warm, sometimes downright cold air doing all sorts of activities related to the season, including spending an evening well into dark raking and jumping into leaf piles illuminated by the light of the tractor headlights (hah!). Reading and finishing Tenth of December by George Saunders and loving it! We then got to enjoy our monthly book club night, this time at The Windber Hotel on trivia night where we ate, chatted about th...

CDF and PPF in Excel, R and Python

How to compute the cumulative distribution functions and the percent point functions of various commonly used distributions in Excel, R and Python. I use Excel (in conjunction with Tanagra or Sipina), R and Python for the practical classes of my courses on data mining and statistics at the University. Often, I ask students to perform hypothesis tests, calculate confidence intervals, etc. Since we work on computers, it is obviously out of the question to use statistical tables to obtain the quantiles or p-values of the commonly used distributions. In this tutorial, I present the main functions for the normal distribution, Student's t-distribution, the chi-squared distribution and the Fisher-Snedecor distribution. I realized that students sometimes find it difficult to relate what they read in statistical tables to the corresponding functions in software, which they have trouble identifying. It is also an opportunity for us to verify the equivalences between the functions proposed by Excel, R (sta...
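To make the CDF / percent point function pairing concrete, here is a minimal sketch for the standard normal distribution only, using Python's standard library (`statistics.NormalDist`). It is not taken from the tutorial: the helper names `normal_cdf`, `normal_ppf` and `two_sided_p_value` are made up for illustration. For Student's t, chi-squared and Fisher-Snedecor distributions one would typically turn to a library such as SciPy (`scipy.stats`), which is an assumption about tooling rather than something the excerpt states.

```python
from statistics import NormalDist

# Standard normal distribution: mean 0, standard deviation 1.
std_normal = NormalDist(mu=0.0, sigma=1.0)


def normal_cdf(x: float) -> float:
    """Cumulative distribution function: P(Z <= x)."""
    return std_normal.cdf(x)


def normal_ppf(p: float) -> float:
    """Percent point function (inverse CDF): the q with P(Z <= q) = p."""
    return std_normal.inv_cdf(p)


def two_sided_p_value(z: float) -> float:
    """Two-sided p-value for an observed z statistic (hypothetical helper)."""
    return 2.0 * (1.0 - std_normal.cdf(abs(z)))


if __name__ == "__main__":
    # Quantile used for a 95% two-sided confidence interval.
    print(normal_ppf(0.975))
    # The CDF undoes the PPF: this prints a value very close to 0.975.
    print(normal_cdf(normal_ppf(0.975)))
    # p-value of a z statistic equal to 1.96.
    print(two_sided_p_value(1.96))
```

The point of the sketch is the reading direction: a statistical table read "from value to probability" corresponds to the CDF, while a table read "from probability to value" corresponds to the percent point function.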

The "compiler" package for R

It is widely agreed that R is not a fast language, notably because it is an interpreted language. To overcome this issue, some solutions exist that make it possible to compile functions written in R. The gains in computation time can be considerable, but they depend on our ability to write code that can benefit from these tools. In this tutorial, we study the efficiency of Luke Tierney's “compiler” package, which is provided in the base distribution of R. We program two standard data analysis treatments, (1) with and (2) without using loops: the scaling of variables in a data frame; the calculation of a correlation matrix by matrix product. We compare the efficiency of the non-compiled and compiled versions of these functions. We observe that the gain from compilation is dramatic for the version with loops, but negligible for the second variant. We also note that, in the R 3.4.2 version used, it is not necessary to explicitly compile the functions containing loops because there is a JI...
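The two treatments mentioned above can be sketched as follows. This is a Python/NumPy analogue written for illustration, not the tutorial's R code, and it says nothing about the “compiler” package itself; the function names are hypothetical. It only shows what "scaling with loops", "scaling without loops" and "correlation matrix by matrix product" amount to, assuming the sample standard deviation (divisor n - 1) is used.

```python
import numpy as np


def scale_with_loops(data):
    """Center and reduce each column using explicit loops."""
    n_rows, n_cols = data.shape
    scaled = np.empty((n_rows, n_cols), dtype=float)
    for j in range(n_cols):
        # Column mean, accumulated element by element.
        total = 0.0
        for i in range(n_rows):
            total += data[i, j]
        mean = total / n_rows
        # Sample standard deviation (divisor n - 1).
        sum_sq = 0.0
        for i in range(n_rows):
            sum_sq += (data[i, j] - mean) ** 2
        std = (sum_sq / (n_rows - 1)) ** 0.5
        for i in range(n_rows):
            scaled[i, j] = (data[i, j] - mean) / std
    return scaled


def scale_vectorized(data):
    """Same scaling, expressed with whole-array operations (no loops)."""
    return (data - data.mean(axis=0)) / data.std(axis=0, ddof=1)


def correlation_by_matrix_product(data):
    """Correlation matrix as Z'Z / (n - 1), where Z is the scaled data."""
    z = scale_vectorized(data)
    return z.T @ z / (data.shape[0] - 1)


if __name__ == "__main__":
    rng = np.random.default_rng(0)
    sample = rng.normal(size=(100, 4))  # made-up data for the demo
    print(np.allclose(scale_with_loops(sample), scale_vectorized(sample)))
    print(correlation_by_matrix_product(sample).shape)
```

The loop version does many small interpreted operations, which is the kind of code where compilation tends to pay off; the vectorized version already delegates the work to optimized routines.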

Around Here Forty-One: 10/07-13

A look into what it is like to live in our home just this minute. Intentional Outdoor Hours: 496+ hours (of 1000). I did terrible this week (only up 3 hours!). It was pretty chilly and kind of rainy, so my level of interest in being outside was at an embarrassing low. I scored some time though with football, volunteering, and playing in the yard. Reading The Tenth of December by George Saunders and eating it up. I really loved the short story Home and Saunders' ability to create characters and a world that you can become so invested in within a short story of a few pages. I lent out a few of my favorite books to one of my favorite friends this week (Dear Mr You, Dark Matter, and Big Magic) because we are nerds and wanna talk about all the books all the time. Beginning the new year of Sunday school. I'm teaching grades four and five again this year and Grey moved up to the exciting year of second grade (first reconciliation and first Holy communion year!) and Gem started K...

Regression analysis in Python

Statsmodels is a Python module that provides classes and functions for the estimation of many different statistical models, as well as for conducting statistical tests and statistical data exploration. In this tutorial, we will try to identify the capabilities of StatsModels by conducting a case study in multiple linear regression. We will discuss: the estimation of model parameters using the ordinary least squares method, the implementation of some statistical tests, the checking of the model assumptions by analyzing the residuals, the detection of outliers and influential points, the analysis of multicollinearity, and the calculation of the prediction interval for a new instance.
Keywords: regression, statsmodels, pandas, matplotlib
Tutorial: en_Tanagra_Python_StatsModels.pdf
Dataset and program: en_python_statsmodels.zip
References: StatsModels: Statistics in Python
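As a rough illustration of the first item in that list, the sketch below estimates the coefficients of a multiple linear regression by ordinary least squares using plain NumPy. It is not the tutorial's program and does not use the StatsModels API; the function name `fit_ols` and the tiny dataset are made up. In StatsModels the same fit would normally go through its `OLS` class with an added constant column, which also reports the tests and diagnostics listed above.

```python
import numpy as np


def fit_ols(X, y):
    """Ordinary least squares fit of y on the columns of X plus an intercept.

    Returns (coefficients, residuals, r_squared); the first coefficient
    is the intercept.
    """
    # Prepend a column of ones so the model includes an intercept.
    design = np.column_stack([np.ones(len(X)), X])
    # Least-squares solution of design @ coef ~= y.
    coef, *_ = np.linalg.lstsq(design, y, rcond=None)
    residuals = y - design @ coef
    ss_res = float(residuals @ residuals)
    ss_tot = float(((y - y.mean()) ** 2).sum())
    r_squared = 1.0 - ss_res / ss_tot
    return coef, residuals, r_squared


if __name__ == "__main__":
    # Made-up data: y depends linearly on two predictors, plus small noise.
    rng = np.random.default_rng(42)
    X = rng.normal(size=(30, 2))
    y = 1.0 + 2.0 * X[:, 0] - 3.0 * X[:, 1] + rng.normal(scale=0.1, size=30)
    coef, residuals, r_squared = fit_ols(X, y)
    print("coefficients:", coef)
    print("R squared:", r_squared)
```

The residuals returned here are the raw material for the assumption checks and outlier detection mentioned in the excerpt; those steps are left out of the sketch.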

Around Here Thirty-Nine and Forty: 09/23-10/06

Each night as we tuck the kids into bed, we have this baffling moment where we look at each other and ask, 'wait, what just happened?' Whole days swallowed up in the fast-paced whirlwind that our life currently is. There are flashes that we hold on to: Grey cracking up about anything Rusty does and reading so fluently now, a proud Gemma announcing that she had 'no cries' at school and is soooooo close to getting on the letter wall, Violet's scrunched up nose giggle when we give her a big snuggle that turns into a tickle, and Rust Man running with arms held high and open for a hug. It is a good time right now, gosh - these kids, and their smiles and their love for us and each other; their general enthusiasm about life. And this marriage that feels like we're camping out in a foxhole together awaiting the next attack from the kids, or life, or work - but we have each other's back so we're going to be okay. But it's definitely a whirlwind and I'm...