Greetings meese! After u/Krayzyfranco was kind enough to conduct a survey collecting some data on this community and provided some first insights and the data set, I decided to look into the data a little bit further. The initial posts on demographic data and VDOT data can be found here and here. The latter link also contains a link to the full data set.

The source code for the analysis below is on GitHub Gist. It will stay there for a while; if it is gone at some point in the future, I also maintain a copy in a private repo elsewhere. The entire analysis was coded in R, and you will need the original data set. If the link to the data becomes inactive, u/Krayzyfranco will probably be kind enough to provide the data for a little bit longer.

In addition to the data set you will of course also need a working installation of a recent version of R and the following packages:

readxl: read MS Excel files
tidyr: clean up data
tibble: data frame niceties
psych: for the describeBy function
ggplot2: plotting
caret: finding near-zero-variance variables
dplyr: some data transformations
gbm: boosting

Optional is the package car, to calculate ANOVAs on the lm objects.

If you source the 00_runAll.R file, the entire analysis will run. Be careful though, as this modifies your workspace directly, so you might want to start in a fresh workspace/project to do so.

So, let's begin. The first part of the analysis is mainly reading and cleaning up the data. I shorten some of the lengthy strings resulting from the options in the survey. Additionally, I one-hot encode the technology options (have a column for each option containing only TRUE/FALSE). In the next step I convert all units to metric, as these are the units I (and most of the world) am familiar with. If you want to rerun the analysis using US customary units you won't need to modify too much, though. I went into the analysis without a clear hypothesis.
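The clean-up steps described above (one TRUE/FALSE column per technology option, then conversion to metric units) could look roughly like the following. This is a minimal Python sketch for illustration only, not the author's R code: the column names, the ";" separator and the two sample rows are assumptions, since the survey's actual layout is not shown here.

```python
# Illustrative sketch: column names, separator and sample rows are hypothetical.
MILE_IN_KM = 1.609344  # exact definition of the international mile
INCH_IN_CM = 2.54      # exact definition of the inch


def one_hot(rows, column, sep=";"):
    """Replace a multi-select text column by one True/False column per option."""
    options = sorted(
        {opt.strip() for row in rows for opt in row[column].split(sep) if opt.strip()}
    )
    encoded = []
    for row in rows:
        chosen = {opt.strip() for opt in row[column].split(sep)}
        new_row = {key: value for key, value in row.items() if key != column}
        for opt in options:
            new_row[f"{column}_{opt}"] = opt in chosen
        encoded.append(new_row)
    return encoded


def pace_to_min_per_km(pace_min_per_mile):
    """Convert a pace in minutes per mile to minutes per kilometre."""
    return pace_min_per_mile / MILE_IN_KM


def inches_to_cm(inches):
    """Convert a length in inches to centimetres."""
    return inches * INCH_IN_CM


rows = [
    {"technology": "GPS watch;Heart rate monitor", "easy_pace_min_per_mile": 8.0},
    {"technology": "GPS watch", "easy_pace_min_per_mile": 9.5},
]

encoded = one_hot(rows, "technology")
for row in encoded:
    row["easy_pace_min_per_km"] = pace_to_min_per_km(row.pop("easy_pace_min_per_mile"))
```

Keeping the conversions in small functions means that a rerun in US customary units would only need to skip those calls.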
So first I generated some descriptive statistics to see what I am working with and whether there might be interesting trends in the data. The dependent/response variable was usually either easy pace or workout pace; ideally we find some factors that predict the pace someone can run. I generated descriptive statistics and a test for difference in means for some variables. I offer some interpretation for trends and/or differences I found in the data, but feel free to discuss them.

Age: There seems to be a clear trend of higher age being associated with slower paces. The ends of the age distribution seem to be less clear, but there is much less data available there. An ANOVA suggests a difference in means for both easy and workout pace. (plot)

Continent: Plotting provides a bit of a mixed picture here, but this is likely due to different amounts of available data. In fact, meaningful amounts of data are only available for Europe and North America. (plot)

Daytime: The time of day at which someone prefers to run does not seem to make a difference. However, those who run twice a day do run faster. ANOVAs do not suggest a difference in means here, but I tend not to overrate these, as the predictive quality of a factor is much more important than differences in means. (plot)

Employment status: There seems to be a slight trend here from looking at the plot. I don't find the hypothesis that people who have more time to run (even if not voluntarily) can run faster too crazy. But sample sizes are pretty small for some categories, so take the data with a big grain of salt. The ANOVA, however, does not suggest differences in means. (plot)

Gender: There is a clear difference between those who identify as female and male. For trans* there is only one data point, so that should not be interpreted. That difference between genders has been described more than once, so this is not exactly groundbreaking. But also note that the quite unequal sample sizes may skew results further.
(plot)

Route: No differences; it would have been odd if route preference in some way related to pace. (plot)

Training plan: Those who use a dedicated training plan seem to run slower than those who do not. The ANOVA does not suggest a difference, and in fact the difference is rather small, but I found it an interesting trend. (plot)

Travel: It appears that if you are willing to travel farther for a race, you are running faster. This does make sense, because if you are faster you are probably looking for a more challenging race with a more competitive field of participants. When it comes to the ANOVA, it does not suggest a difference for easy pace, but a trend in regards to workout pace. (plot)

For the next part of the analysis I split up the data into two non-overlapping samples, each containing half of the data. One is for training predictive models, the other for testing their performance. I chose a 50/50 split as the sample size is not that big and I did not want to do the training on too small a sample.

I first looked at the relation between easy pace and workout pace. One would tend to think that this scales linearly, but we'll see. I tested five different relationships trying to predict workout pace based on easy pace. The smallest predictive error was achieved by assuming a quasi-binomial relationship with 2:30 min/km as the fastest possible pace and 6:30 min/km as the slowest pace (keep in mind that we are talking about workout pace here), but this was only slightly better than a linear relationship. A less aggressive quasi-binomial model performed worse (2:00 fastest, 9:00 slowest), and log-transforming the workout pace produced the greatest error.
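To make the 50/50 split and the bounded quasi-binomial idea concrete, here is a minimal Python sketch; it is not the author's R code. It assumes the workout pace is rescaled to the interval (0, 1) using the 2:30 and 6:30 min/km limits and then modelled with a logit link, which is one common way to set up such a model. The data are synthetic and noise-free, and the function names are hypothetical. The coefficients are found by iteratively reweighted least squares; binomial and quasi-binomial families give the same point estimates and differ only in the dispersion used for standard errors.

```python
import math
import random

FAST, SLOW = 2.5, 6.5  # assumed limits in min/km (2:30 and 6:30)


def to_unit(pace):
    """Rescale a pace between FAST and SLOW to the interval (0, 1)."""
    return (pace - FAST) / (SLOW - FAST)


def from_unit(p):
    """Map a value in (0, 1) back to a pace in min/km."""
    return FAST + p * (SLOW - FAST)


def sigmoid(eta):
    return 1.0 / (1.0 + math.exp(-eta))


def split_half(rows, seed=1):
    """Shuffle and cut into two non-overlapping halves (train, test)."""
    shuffled = rows[:]
    random.Random(seed).shuffle(shuffled)
    mid = len(shuffled) // 2
    return shuffled[:mid], shuffled[mid:]


def fit_logit_irls(x, p, iterations=25):
    """Fit logit(p) = a + b * x by iteratively reweighted least squares."""
    a, b = 0.0, 0.0
    for _ in range(iterations):
        eta = [a + b * xi for xi in x]
        mu = [sigmoid(e) for e in eta]
        w = [max(m * (1.0 - m), 1e-9) for m in mu]  # IRLS weights
        z = [e + (pi - m) / wi for e, pi, m, wi in zip(eta, p, mu, w)]  # working response
        sw = sum(w)
        xbar = sum(wi * xi for wi, xi in zip(w, x)) / sw
        zbar = sum(wi * zi for wi, zi in zip(w, z)) / sw
        sxx = sum(wi * (xi - xbar) ** 2 for wi, xi in zip(w, x))
        sxz = sum(wi * (xi - xbar) * (zi - zbar) for wi, xi, zi in zip(w, x, z))
        b = sxz / sxx
        a = zbar - b * xbar
    return a, b


# Synthetic (easy pace, workout pace) pairs generated from a known curve.
easy = [4.0 + 0.1 * i for i in range(21)]
rows = [(e, from_unit(sigmoid(-6.0 + 1.2 * e))) for e in easy]

train, test = split_half(rows)
a, b = fit_logit_irls([e for e, _ in train], [to_unit(w) for _, w in train])


def predict(easy_pace):
    """Predicted workout pace; always stays strictly between FAST and SLOW."""
    return from_unit(sigmoid(a + b * easy_pace))


mse = sum((predict(e) - w) ** 2 for e, w in test) / len(test)
```

Because the prediction passes through the sigmoid, it can never leave the chosen limits, which is the property that distinguishes this setup from a purely linear model.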
However, we are talking about somewhat small differences here, with the mean squared error being:

0.09845982 for quasi-binomial with 2:30 and 6:30 as limits (plot)
[value illegible] for linear (plot)
0.1054721 for quasi-binomial with 2:00 and 9:00 as limits (plot)
0.114642 for log-linear (plot)

The main issue with a purely linear model is that it will predict too extreme values for very fast runners, while with quasi-binomial you'll have to set limits for what you think are the most extreme possible values. Looking at model diagnostics, none of these models seem to be problematic though. If one gathered data from a wide range of runners (including both elite runners and hobby joggers who have just started running), the models could be further refined.

As the next step I looked at VDOT data. If you have read Daniels or browsed this sub for long enough, you will know that VDOT should be an estimate of a runner's ability. A certain VDOT value is associated with a certain set of paces someone should be able to run. This time I used workout pace and gender as predictors for VDOT (ideally gender should not have much influence here, but if it influenced the outcome it should become visible here). I used workout pace instead of easy pace as predictor because the ranges recommended for a certain VDOT are narrower for workouts than for easy paces.

I built three models: the first linear, the second log-linear and the third quasi-binomial, assuming a lower end of 20 and an upper end of 80. The mean squared error for the three models was quite similar:

14.[illegible] for log-linear (plot)
15.69586 for quasi-binomial (limits 20 and 80) (plot)
15.[illegible] for linear (plot)

The question here is, as always with these kinds of things, how much error you are willing to accept. Keep in mind that the errors are squared and sqrt(15) = 3.873, so knowing only one's self-reported workout pace and gender, it seems possible to predict VDOT rather well. (Everything else would be bad for the whole concept of VDOT.
But I think that also allows for concluding that you do not necessarily need a recent race to estimate VDOT.)

As I had not reported anything too spectacular so far, I tried a more open-minded approach to the data: boosted regression trees. The idea here is that you start at 0, then build a tree-based model (imagine a decision tree, e.g. with "if runs per week > 5 then 20 seconds faster pace, otherwise no change in pace"; this is very simplified, but hopefully the idea is clear) and update your model with a shrunken version of that tree (basically a scaled-down version). You repeat that quite a lot of times until your model does not get any better anymore.

Before starting boosting I excluded quite a few variables from the data set: namely all measures of pace, VDOT, those without any variance or close to no variance, and those that are redundant (no need to have height in inches and centimetres). Finally, I also excluded country and US state.

I ran a grid search on a few parameters (interaction depth from 1 to 10 and shrinkage .01, .001, .0001 and .00001), leading to 40 models in total. For small shrinkage parameters you need to build a lot of trees, so it takes quite some time to build these. The best performing model was the one with a shrinkage of .01 and an interaction depth of 10. If you run the code this might vary slightly. The MSEs reported for all models were roughly in the same range ([values illegible]), so there is not a single model that outperforms the rest dramatically.

For boosted regression trees you get a relative influence score characterising how important a variable is for the model, or in other words how much it influences the prediction. In the best performing model the highest influence is by:

Occupation type (value illegible): Seems to me like a clear case of overfitting, to be honest. Breaking down this variable does not provide much insight, other than that, due to the number of options, some occupations seem more likely to run faster than others.
Ideally one should repeat the analysis excluding this variable.

Kilometres in 2017 (value illegible): This, however, seems rather obvious: running more makes you run faster. The progression is pretty much linear, but does plateau at 4268 total kilometres. (plot)

Running Years (12.13885109): This also does seem to make sense. Pace gets faster with more running years; however, at about 30 years of running the trend reverses slightly. This also seems well explainable, as at some point you get slower due to ageing. (plot)

Preferred Route (7.62592134): Out-and-back routes are faster than loops. Those without a clear preference are in between. Unless reproduced in a different data set, I'd consider this overfitting. (plot)

Gender (7.12465873): Male is faster than female. Again, this is not exactly big news. (plot)

BMI (6.00318332): The relationship is mostly linear: the smaller the BMI, the faster you run. Please keep in mind to stay healthy if you decide to lose weight; letting your BMI drop below 18 is probably not a good idea in many regards. (plot)

Preferred Race Distance (value illegible): I should have one-hot encoded this. It is difficult to spot a clear pattern with the data, but it seems like those preferring shorter distances run fastest, those doing ultras the slowest.

Hours Strength Training (value illegible): The data suggests that two hours of strength training per week are the sweet spot. Pace improves sharply from 0 to 2 hours; afterwards it plateaus. (plot)

Age (4.84573508): Pretty much the same as above. For the more extreme values of the distribution there is not enough data, but there appears to be an overall decline of pace with age. (plot)

Runs per Week (4.37246406): More or less as you would imagine. Up to four runs per week does not have too much influence, but at 5 and above pace improves quite a bit. (plot)

In total 25 of the 26 predictors entered had non-zero influence, but after the ones mentioned, the influence decreases quite fast towards zero.
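The boosting idea described earlier (start at 0, fit a small tree to what is still unexplained, add a shrunken version of it, repeat) can be sketched in a few lines. This is a deliberately simplified Python illustration, not the gbm implementation and not the author's code: it uses a single predictor, depth-1 trees (stumps) only, and made-up data in which runners with more than 4 runs per week are 20 seconds per km faster. The function names are hypothetical.

```python
def fit_stump(x, residuals):
    """Find the split 'x > threshold' that minimises the squared error."""
    best = None
    for threshold in sorted(set(x))[:-1]:
        left = [r for xi, r in zip(x, residuals) if xi <= threshold]
        right = [r for xi, r in zip(x, residuals) if xi > threshold]
        left_mean = sum(left) / len(left)
        right_mean = sum(right) / len(right)
        sse = sum((r - left_mean) ** 2 for r in left)
        sse += sum((r - right_mean) ** 2 for r in right)
        if best is None or sse < best[0]:
            best = (sse, threshold, left_mean, right_mean)
    _, threshold, left_mean, right_mean = best
    return threshold, left_mean, right_mean


def boost(x, y, n_trees=200, shrinkage=0.1):
    """Build an additive model of shrunken stumps, starting from 0."""
    prediction = [0.0] * len(y)
    trees = []
    for _ in range(n_trees):
        residuals = [yi - pi for yi, pi in zip(y, prediction)]
        threshold, left_mean, right_mean = fit_stump(x, residuals)
        tree = (threshold, shrinkage * left_mean, shrinkage * right_mean)
        trees.append(tree)
        prediction = [
            p + (tree[2] if xi > threshold else tree[1])
            for p, xi in zip(prediction, x)
        ]
    return trees


def predict(trees, xi):
    """Sum the contributions of all shrunken stumps for one observation."""
    return sum(right if xi > threshold else left for threshold, left, right in trees)


# Made-up data: pace in min/km, 20 seconds faster above 4 runs per week.
runs_per_week = [2, 3, 4, 5, 6, 7]
pace = [5.5, 5.5, 5.5, 5.5 - 20 / 60, 5.5 - 20 / 60, 5.5 - 20 / 60]

trees = boost(runs_per_week, pace)

# The parameter grid mentioned in the text: 10 depths x 4 shrinkage values.
# The stump sketch above ignores the depth; it is listed only to show the count.
grid = [(depth, s) for depth in range(1, 11) for s in (0.01, 0.001, 0.0001, 0.00001)]
```

With a shrinkage of 0.1, each round closes only a tenth of the remaining gap, which is why small shrinkage values need many trees before the model stops improving.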
Ideally I would also have evaluated the next few best fitting models to explore which may make more sense with regards to what is modelled (for example, one would probably prefer a model in which occupation type is not the most influential variable to predict pace).

So, this is it. I hope you found this interesting, although there was nothing groundbreaking in this analysis. Feel free to run the analysis yourself and maybe find some more interesting things in the data.