Do running shoes weaken muscles?

We have all seen the claims on blogs, in articles and on places like YouTube that running shoes weaken muscles, and that this is why we should not be using big, bulky motion-control running shoes. The claims are made so regularly and so assertively that you would think those making them actually have some evidence to back them up, but they don’t. Don’t you think that the onus is on those making the claims to come up with the evidence? Have you noticed that they never do?

Do running shoes weaken muscles? Consider these points:

  • There is no evidence that they do
  • If a non runner starts running tomorrow in the most bulky motion controlling running shoe, surely their muscles are going to get stronger and not weaker? How is that shoe actually going to weaken the muscles if they are using the muscles more by running?
  • Critics of the big bulky motion control running shoes like to point out that the evidence shows that, in general, they do not really control motion (and they are right; I will do a future article on that). If the shoes are not controlling any motion, then the foot must be moving, so how is that going to weaken the muscles? Do you see the hypocrisy of having this one both ways?
  • As discussed here, a paper presented at the 2012 ACSM meeting showed that “barefoot running does not result in greater activation in these muscles compared to running shod. This suggests that barefoot running may not result in strengthening of the foot intrinsic muscles”, so if there is no difference in muscle activation, how do shoes weaken the muscles?
  • The most motion controlling footwear are probably ski boots; no one is raising concerns nor is anyone seeing an epidemic of weak feet in skiers!
  • If it was the case, then you would expect to see more pronated/flat feet in runners compared to the general population. There is no evidence nor are there any reports of more flat/pronated feet in runners compared to the general population.
  • Three studies have looked at foot orthotics and muscle strength. Two have shown an increase and one no change, so the evidence is that foot orthotics do not weaken the muscles. Running shoes are allegedly less supportive than foot orthotics, so if foot orthotics don’t weaken the muscles, then how do running shoes do it?

I have no doubt that barefoot or minimalist running does strengthen some muscles more than running in shoes, but that does not mean they were weak to start with, and neither does it mean that running shoes weaken muscles. In fact, it has been shown that the anterior tibial muscle is less active in forefoot striking, which means that this muscle will get relatively weaker in those who forefoot strike while barefoot or minimalist running. Some muscles are used more to forefoot strike and some are used less, so this makes a mockery of the blanket claim that barefoot running strengthens the muscles, when some muscles are used less!

What does the evidence say about running shoes and muscle strength? We now have this study:

Athletic training with minimal footwear strengthens toe flexor muscles
Jan-Peter Goldmann, Wolfgang Potthast & Gert-Peter Brüggemann
Footwear Science
During the propulsive phase of human locomotion, long and short toe flexor muscles (TFM) are exposed to mechanical stimuli caused by ground reaction forces. Further, flexible footwear seems to facilitate increased loading on foot structures. The purpose of the study was to evaluate the effects of high intensity athletic training with minimal footwear on TFM strength. Forty-seven female sport students participated and were randomly divided in three groups: the experimental group (EG; n = 18; 25 ± 5 yrs, 59 ± 6 kg) and the training control group (TG; n = 18; 23 ± 2 yrs, 64 ± 6 kg) performed high intensity athletic training (3 weeks, 5 times per week, 30 min per session) on the forefoot. The EG wore a minimal shoe, the TG performed the exercises with traditional training shoes. The basic control group (CG; n = 11; 27 ± 5 yrs, 63 ± 7 kg) participated in no training programme. To evaluate the training effects on TFM strength, maximum metatarsal phalangeal joint (MPJ) plantar flexion moments during maximal voluntary isometric contractions (MVIC) at 0° and 25° MPJ dorsal flexion were measured in a custom made dynamometer before and after the training intervention. The results showed that (1) in 0° MPJ dorsal flexion, MPJ moments were significantly increased in the EG (p < 0.01) and TG (p < 0.05) and differed significantly to the CG (p < 0.05); (2) in 25° MPJ dorsal flexion, TFM strength was significantly increased in the EG (p < 0.01), but not in the TG and CG (p > 0.05). In this joint angle position the EG significantly differed from the TG and CG (p < 0.05). The results of the study show that athletic exercises with minimal footwear strengthen TFM after three weeks intensive training.

This study compared toe flexion strength in a control group, a traditional shoe group and a minimalist group. The authors claimed that the minimalist group got stronger than both the traditional shoe group and the control group. Note that strength in the traditional running shoe group did not go down (it actually went up!), so we have some evidence that running shoes do not weaken muscles and in fact strengthen them (which supports the points I made above). The paper claims that the minimalist group did get stronger, but the results are not as clear as stated, because the authors used repeated measures t-tests (rather than an ANOVA) to do within-groups comparisons, which is not how you are supposed to analyze a randomized controlled trial. They should have done a between-groups comparison (which is what the CONSORT statement and every textbook on randomized controlled trials say you should do). Until we see the results of a between-groups analysis, the claim that the minimalist group did statistically significantly better cannot be verified, though they probably did. However, the within-groups analysis did show that the traditional running shoe group did not get weaker and in fact got stronger, which is evidence that contradicts all the unsupported claims that we see and hear about running shoes weakening muscles.
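
To make the within-groups versus between-groups distinction concrete, here is a minimal sketch in Python. The numbers are made up for illustration and are not data from the paper, and the group names and helper functions are hypothetical. It computes a paired t statistic for each group (the within-groups question: did this group change?) and a one-way ANOVA F statistic on the change scores (the between-groups question: did the groups change by different amounts?).

```python
from math import sqrt
from statistics import mean, stdev

# Hypothetical pre/post strength scores (arbitrary units); NOT data from the paper.
groups = {
    "minimal":     {"pre": [10, 12, 11, 13, 12], "post": [13, 14, 13, 16, 15]},
    "traditional": {"pre": [11, 12, 10, 13, 12], "post": [12, 14, 11, 14, 14]},
    "control":     {"pre": [11, 13, 12, 10, 12], "post": [11, 13, 13, 10, 12]},
}

def change_scores(group):
    """Post minus pre for every subject in one group."""
    return [post - pre for pre, post in zip(group["pre"], group["post"])]

def paired_t(diffs):
    """Within-group statistic: mean change divided by its standard error."""
    return mean(diffs) / (stdev(diffs) / sqrt(len(diffs)))

def one_way_anova_f(samples):
    """Between-groups F statistic for a list of samples (here, change scores)."""
    all_values = [x for s in samples for x in s]
    grand_mean = mean(all_values)
    ss_between = sum(len(s) * (mean(s) - grand_mean) ** 2 for s in samples)
    ss_within = sum((x - mean(s)) ** 2 for s in samples for x in s)
    df_between = len(samples) - 1
    df_within = len(all_values) - len(samples)
    return (ss_between / df_between) / (ss_within / df_within)

changes = {name: change_scores(g) for name, g in groups.items()}
for name, diffs in changes.items():
    print(f"{name}: mean change {mean(diffs):.2f}, paired t = {paired_t(diffs):.2f}")
f_stat = one_way_anova_f(list(changes.values()))
print(f"between-groups F on change scores = {f_stat:.2f}")
```

The per-group t statistics only say whether each group changed from its own baseline. Only the F statistic (or a follow-up comparison between specific groups) addresses whether one shoe condition did better than another.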

As always, I go where the evidence takes me until convinced otherwise, and the evidence tells me that running shoes do not weaken the muscles.

Goldmann, J., Potthast, W., & Brüggemann, G. (2013). Athletic training with minimal footwear strengthens toe flexor muscles Footwear Science, 5 (1), 19-25 DOI: 10.1080/19424280.2012.744361


21 Responses to Do running shoes weaken muscles?

  1. Pete Larson March 24, 2013 at 1:56 pm #

    Interesting points Craig, and I generally agree. I think the whole debate could be summarized quite simply by considering simple training effects. I doubt anyone would argue with the statement that running will make you stronger than sitting on the couch, no matter what you put on your feet. Similarly, going to the gym and lifting weights will make you stronger than sitting on the couch. How and where those strength gains are made depends in the one case partly on what kind of shoes you wear (and surfaces, speeds, etc.), and in the other case which type of lifting exercises you do (bench press vs. curls vs. leg press, etc.).

    If I wear motion control when I run, that will work some muscles more than others, and if I run barefoot the targeted muscles will be a bit different. I’ll get stronger in both cases, just in a slightly different way. The issue is when you make a switch from one to the other without taking time for the training effect to kick in at a tissue level via adaptation. You could probably extend this same argument to the Vibram bone edema study. Personally, I like to mix things up so that I maximize the training effect to as many tissues as possible.

  2. Marc Schwartz March 25, 2013 at 3:10 pm #

    I am curious on your comments on the statistical method used and wanted to expand on that. I do not have access to the paper, as it is behind a paywall.

    First and foremost, if the three groups were randomized, why is the CG group notably smaller than the other two? Was there a non-random loss of 7 subjects in that group or was there an a priori design in which the CG group, as a control, had a lower randomization ratio than 1:1:1?

    Second, if I understand the design correctly, there are 3 groups and each had two measurements, one at baseline and one post training. If that is the case, then the use of ANOVA, as you suggest, would be on the differences between the two measurements for each subject within each group. Even with a randomized study, there is the potential for an imbalance in the baseline measures across the groups. Randomization only maximizes the likelihood that any imbalance is due to chance, but does not ensure that there will not be an imbalance. If such an imbalance existed, it could bias the findings of the study, if for no other reason than regression to the mean.

    As a result, the use of ANCOVA would be preferred to account for any baseline imbalance. In this case, in a linear regression model, the post treatment measure would be the response (dependent) variable, the baseline measure would be an independent variable, as would the “treatment” group as a categorical covariate, with one group (eg CG) as the reference level.

    Presuming no interaction between the baseline measure and treatment group, the beta coefficients for the two other levels of treatment group, would represent the relative change for each of the other two groups from the reference group, thus has a very specific interpretation as the difference in response to treatment.

    The use of ANCOVA, would then allow for predictions to be made of the post training measures in the three groups at given baseline measures, making it easy to visualize the changes both over time and between groups, along with confidence intervals.
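
A minimal sketch of the ANCOVA idea described above, assuming a common slope across groups (no baseline-by-group interaction). The data are invented for illustration and are not from the paper, and the function names are hypothetical. With a common slope, the group coefficient from the regression equals the difference in post-training means corrected for the difference in baseline means, which is what the code computes directly.

```python
from statistics import mean

# Hypothetical (baseline, post-training) strength pairs; NOT data from the paper.
data = {
    "CG": [(10, 10), (12, 12), (14, 14), (16, 16)],
    "TG": [(11, 12), (13, 14), (15, 16), (17, 18)],
    "EG": [(9, 12), (11, 14), (13, 16), (15, 18)],
}

def group_means(pairs):
    """Mean baseline and mean post-training value for one group."""
    return mean(x for x, _ in pairs), mean(y for _, y in pairs)

def pooled_slope(data):
    """Common within-group slope of post-training on baseline."""
    sxy = sxx = 0.0
    for pairs in data.values():
        xbar, ybar = group_means(pairs)
        sxy += sum((x - xbar) * (y - ybar) for x, y in pairs)
        sxx += sum((x - xbar) ** 2 for x, _ in pairs)
    return sxy / sxx

def adjusted_difference(data, group, reference="CG"):
    """Baseline-adjusted difference in post-training score versus the reference group."""
    slope = pooled_slope(data)
    x_g, y_g = group_means(data[group])
    x_r, y_r = group_means(data[reference])
    return (y_g - y_r) - slope * (x_g - x_r)

for g in ("TG", "EG"):
    raw = group_means(data[g])[1] - group_means(data["CG"])[1]
    adj = adjusted_difference(data, g)
    print(f"{g} vs CG: raw post difference {raw:.1f}, baseline-adjusted {adj:.1f}")
```

In this toy data set both training groups finish 2 units above the control group, but because their baselines differ, the adjusted differences are 1 for TG and 3 for EG. A real analysis would also need standard errors and confidence intervals, which this sketch omits.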

    Your reference to repeated measures t-tests, if I understand correctly, would imply that the authors performed 3 paired sample t-tests (using the differences between baseline and post training) as the data for each group. The null hypothesis would be that the mean difference between baseline and post training is 0 within each group. That would seem consistent with the p values reported in the abstract, but in the end, is not the critical hypothesis.

    They then apparently performed multiple pairwise tests between each of the three possible pairings of the three groups (TG/EG, TG/CG and EG/CG) using the differences between baseline and post training as the data for each group. If so and they did not correct for multiple comparisons, that would bias the findings as well. Their p values would be too low, by an amount that depends on whether a conservative correction like Bonferroni, or something less conservative like Hochberg, is applied.
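
As a rough illustration of how much a multiple-comparison correction can matter, the sketch below adjusts three pairwise p values with the Bonferroni and Hochberg procedures. The p values are hypothetical and are not taken from the paper, and the function names are my own.

```python
def bonferroni(pvals):
    """Multiply every p value by the number of comparisons (capped at 1)."""
    m = len(pvals)
    return [min(1.0, p * m) for p in pvals]

def hochberg(pvals):
    """Hochberg step-up adjustment: work from the largest p value downwards."""
    m = len(pvals)
    order = sorted(range(m), key=lambda i: pvals[i], reverse=True)
    adjusted = [0.0] * m
    running_min = 1.0
    for rank, idx in enumerate(order, start=1):
        # The largest p value is multiplied by 1, the next by 2, and so on.
        running_min = min(running_min, rank * pvals[idx])
        adjusted[idx] = running_min
    return adjusted

# Hypothetical unadjusted p values for the three pairwise comparisons.
labels = ["TG vs EG", "TG vs CG", "EG vs CG"]
raw = [0.04, 0.03, 0.01]

for label, p, b, h in zip(labels, raw, bonferroni(raw), hochberg(raw)):
    print(f"{label}: raw {p:.3f}, Bonferroni {b:.3f}, Hochberg {h:.3f}")
```

With these made-up inputs, all three comparisons are below 0.05 before correction. After Bonferroni only one remains below 0.05, while Hochberg leaves all three below it, which is the sense in which Hochberg is less conservative.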

    An alternative approach, depending upon their a priori hypothesis, would be to perform a Dunnett’s test, in which the TG and EG groups are each compared against the CG group. So you have two comparisons instead of three and Dunnett’s has a special adjustment in that setting for the multiple comparisons, yielding a more powerful test.

    Lastly, at least in the abstract, they only quote p values with no indication of the actual change in measurement from baseline to post training. Thus, there is no indication of the “effect size”. One can quote p values all day long. However, it is possible that one can have a statistically significant finding, while being clinically irrelevant, if the variance within each group is low enough. Was the change clinically relevant within any of the 3 groups and was the difference between the groups clinically relevant?
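
A small sketch of why a p value alone says little about the size of an effect. The numbers are invented and are not from the paper. The mean change is tiny relative to the baseline, but because every subject changes by almost the same amount, the paired t statistic is very large.

```python
from math import sqrt
from statistics import mean, stdev

def paired_change(pre, post):
    """Return (mean change, paired t statistic) for matched pre/post measurements."""
    diffs = [b - a for a, b in zip(pre, post)]
    t_stat = mean(diffs) / (stdev(diffs) / sqrt(len(diffs)))
    return mean(diffs), t_stat

# Hypothetical measurements: about a 1% change with very little spread between subjects.
pre = [50.0, 52.0, 48.0, 51.0, 49.0, 50.0]
post = [50.5, 52.4, 48.5, 51.6, 49.5, 50.5]

change, t_stat = paired_change(pre, post)
print(f"mean change = {change:.2f} on a baseline of about {mean(pre):.0f}")
print(f"paired t = {t_stat:.1f}")
```

A t statistic this large would give a very small p value, yet whether a change of about half a unit matters clinically is a separate question that the p value cannot answer.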

    There seem to be multiple problems here and again, lacking access to the full paper, it is not clear what the a priori powered hypothesis was (they had to pre-define the sample size against some formally stated null hypothesis). None of the authors is a biostatistician from what I can tell from a Google search and it is not clear if they were lacking access to one in the study design phase and did this all on their own or if there were other problems. Was there even an acknowledgement at the end of the paper to a biostatistician?

    To quote R.A. Fisher from 1938:

    “To consult the statistician after an experiment is finished is often merely to ask him to conduct a post mortem examination. He can perhaps say what the experiment died of.”

  3. Craig March 25, 2013 at 11:36 pm #

    @Pete; thanks, which is exactly why I run in my NB Minimus’s and Hoka One Ones! I had just grown tired of all the blanket statements that running shoes weaken muscles!

    @Marc; I have emailed you the full paper.
    I have written the script in ‘my head’ for a YT video I will record shortly on this issue of between and within groups analysis, to explain it more in lay terms as it is happening a lot lately.

    • Marc Schwartz March 26, 2013 at 3:37 pm #


      Thanks kindly for the paper. I have had a chance to review the paper in more detail and have some additional comments.

      Not being a clinician, I cannot comment on that aspect of things and I will defer to you and Pete on those points. However, I am a biostatistician, working in clinical trials, and there are additional problems with this study design and the analytic methods used.

      I still have an issue with the randomization approach and why there are fewer subjects in the CG group. There was something amiss in the implementation of randomization in this study. Either they lost 7 subjects in that group and are not reporting that, which would go against CONSORT and general regulatory guidance for human clinical trials (eg. “Intention to Treat” analyses), or they improperly implemented their randomization allocations. In a small study such as this (for me, this is a very small study), to have such an imbalance in sample size in one of the 3 arms is highly problematic and raises questions about the integrity of the study design.

      There is no indication of an a priori hypothesis and a power/sample size analysis, which would be mandatory in a randomized study such as this. Somehow, they had to calculate and justify a target sample size and there is no indication of the method in the paper. Lacking that, they failed to prospectively control the probability of Type I (false positive) and Type II (false negative) errors in the design. That essentially reduces the level of evidence in this study to a randomized observational study. In clinical trials, this type of study would potentially be considered a “pilot study”, in which the basic study design is implemented on a small set of subjects, where the logistics of the study are tested for problems (such as techniques resulting in obtaining inconsistent or poor measurements). In addition, the small set of data collected in a pilot study, in the absence of prior knowledge, would provide the basis for making assumptions about key data (means, variance, etc.) that would then be used to design a formally powered study on a larger sample.
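
For readers unfamiliar with what a power/sample size analysis involves, here is a simplified sketch. It uses the standard normal approximation for comparing two group means and is only an illustration: the function name is hypothetical, the effect sizes are assumptions, and a real three-arm trial would need a calculation matched to its actual primary hypothesis.

```python
from math import ceil
from statistics import NormalDist

def n_per_group(effect_size, alpha=0.05, power=0.80):
    """Approximate subjects per group for a two-sided, two-group comparison.

    effect_size is the expected difference in means divided by the standard
    deviation (Cohen's d). Uses the normal approximation, so it slightly
    underestimates the number a t-test based calculation would give.
    """
    z = NormalDist().inv_cdf
    z_alpha = z(1 - alpha / 2)  # controls the Type I (false positive) error rate
    z_beta = z(power)           # controls the Type II (false negative) error rate
    return ceil(2 * ((z_alpha + z_beta) / effect_size) ** 2)

for d in (0.5, 0.8):
    print(f"assumed effect size d = {d}: about {n_per_group(d)} subjects per group")
```

Under these assumptions, detecting a moderate effect (d = 0.5) with 80% power needs roughly 63 subjects per group, and even a large effect (d = 0.8) needs about 25, which puts group sizes of 11 to 18 in perspective.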

      They indicated that they used a ‘repeated measures ANOVA’. However, that would be highly unusual in the case of only two (pre and post) measurements. They would have likely included both baseline and post-training measurements as the combined dependent variable with time and group as the independent variables, along with an error term to account for the within group and within subject variability. This would, in fact, allow for both between and within group analyses. However, due to the material imbalance in the sample size between groups, a key underlying assumption of this technique is violated, which is balance in the observations. That is, an equal number of measures in each sub-group.

      With an unbalanced design, they should have looked to use either the ANCOVA (regression) technique that I referenced in my prior reply, or consider a mixed effects linear regression model. However, there are really not enough replications of the measures for each subject (to estimate the random effects) to use that technique. Thus, it really leaves them with only one choice, which is the ANCOVA.

      It is interesting that they tested the data for normality a priori, which is really a red herring. There are multiple such tests, each placing more or less importance on the tails of the distribution. These days, the question is not whether the data are normal, but are the data ‘normal enough’ so as to not materially violate the assumptions of the tests you are performing. The typical approach would be to use a Q-Q (quantile-quantile) plot to visualize the distribution and use the eyeball test rather than a formal null hypothesis test. The problem with formal null hypothesis tests is that in small samples, it can be difficult to establish non-normality and in large samples, trivial deviations will become significant. The alternative, if the data are indeed highly skewed, would be to consider using transformations on the data (eg. log) or consider using non-parametric techniques.
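
The Q-Q plot mentioned above pairs each sorted observation with the value a normal distribution would predict at the same rank. A minimal sketch of computing those pairs follows. The sample is made up, the function name is hypothetical, and the plotting positions (i - 0.5) / n are one common convention that I am assuming here.

```python
from statistics import NormalDist, mean, stdev

def qq_points(sample):
    """Return (theoretical normal quantile, observed value) pairs for a Q-Q plot."""
    n = len(sample)
    fitted = NormalDist(mean(sample), stdev(sample))
    ordered = sorted(sample)
    return [(fitted.inv_cdf((i - 0.5) / n), x) for i, x in enumerate(ordered, start=1)]

# Hypothetical measurements, purely for illustration.
sample = [2.0, 1.0, 5.0, 4.0, 3.0]
points = qq_points(sample)
for theoretical, observed in points:
    print(f"theoretical {theoretical:6.2f}   observed {observed:6.2f}")
```

If the data are close to normal, the pairs fall near a straight line when plotted against each other; systematic curvature at the ends suggests skew or heavy tails.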

      They do provide estimates of within group (but not between group) effect sizes in the paper, using Cohen’s d (ranging from 0.1 to 0.6). This is a common measure. However, given the underlying design problems in the study, I would be hesitant to interpret these findings.
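
For reference, a between-group Cohen’s d, the kind of effect size the comment says the paper does not report, can be computed from the change scores of two groups using a pooled standard deviation. The sketch below uses invented change scores, not data from the paper, and the variable names are hypothetical.

```python
from math import sqrt
from statistics import mean, variance

def cohens_d(a, b):
    """Difference in means divided by the pooled standard deviation."""
    na, nb = len(a), len(b)
    pooled_var = ((na - 1) * variance(a) + (nb - 1) * variance(b)) / (na + nb - 2)
    return (mean(a) - mean(b)) / sqrt(pooled_var)

# Hypothetical change scores (post minus pre) for two training groups.
eg_changes = [2, 4, 3, 5, 1]
tg_changes = [1, 3, 2, 4, 0]

d = cohens_d(eg_changes, tg_changes)
print(f"between-group Cohen's d = {d:.2f}")
```

With samples this small the estimate is very imprecise, so a confidence interval would be needed before reading much into it.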

      There are multiple issues with the paper and there is not enough information in the paper to provide clarifications on their methods. One reasonably has to wonder about the rigor of the peer review process for this particular journal. I would say, in the absence of more information, they are lacking a biostatistician on their editorial review board.

      As a result of these issues, I would be highly skeptical in accepting the conclusions raised in the paper.

      • Craig March 26, 2013 at 10:09 pm #

        Thanks Marc!

        I missed the lack of ‘intention to treat’ analysis!

        The problem could be the nature of the journal (of which I am actually on the Editorial Board!). It is a biomechanics journal, so it could be that the reviewers were not that familiar with clinical trials methodology and the CONSORT statement.

        You might like to look at the study I mention here. They did not do an intention to treat analysis, which led them to conclude the opposite of what the data probably showed!

    • Peter Larson March 26, 2013 at 5:29 pm #

      I just ran in Hokas the first time a few weeks ago, definitely an interesting shoe. Surprisingly easy to run in coming from much more minimal stuff.

  4. Rick Osler March 26, 2013 at 2:53 am #

    Thanks Craig (and Pete). This is something I have been prescribing for some time…ie reducing ‘sameness’. Typically runners (including myself) will alternate 2-3 pairs of shoes…all different structure/stack height/gradient, which I believe improves resilience. Some will always use the same shoes and obtain variety by terrain changes…not always easy for the average runner in cities though.
    I do wonder however, whether the less conditioned office worker will benefit from a change in midsole structure (eg a Kayano wearer alternating with say a Brooks Ghost or a Sky Speed….ie no change in heel-toe gradient, but variable stack height) versus 2 shoes with different stack heights AND gradients. The former is ‘safer’ for those who are less conditioned or have poor ankle joint flexibility; the latter gives more variety but, without the correct adaptation time to a change in heel height differential, is riskier.
    So as a Podiatrist working with elite sport AND the average weekend jogger, what I believe is better from experience and my readings (the latter prescription model above) has issues in terms of APPLICATION when it comes to sales in a retail environment.
    One almost needs to sell a Minimus with half a dozen sessions with an exercise physiologist!

  5. sportinjurymatt March 27, 2013 at 3:00 am #

    Interesting article.
    I wonder if footwear that “controls” natural foot motion can in certain cases be linked to the creation & promotion of muscle imbalances? If so, it could be argued that the footwear has “weakened” certain muscles relative to others? Muscle imbalance is often seen as a precursor to injury and could have played a part in the 2010 study that showed motion control trainers on “low arched” feet actually causing injury.
    This study was highlighted by Ian Griffiths in his article “Choosing Running Shoes: The Evidence Behind the Recommendations”, namely: Ryan MB, Valiant GA, McDonald K, et al: “The effect of three different levels of footwear stability on pain outcomes in women runners: a randomised control trial.” British Journal of Sports Medicine (2010). doi: 10.1136/bjsm.2009.069849.
    As Ian points out, “every single runner who had been classified as having a ‘highly pronated’ foot type and was subsequently put into a motion control shoe reported an injury during a 13 week half marathon training programme.” Was this due to the inappropriate controlling nature of the trainer causing an upset in muscle balance, i.e. relative weakening of certain muscles? I don’t know. Just thought I’d throw it out there…

    • Craig March 27, 2013 at 3:17 am #

      That is not what that study found. They actually reported: “No significant effects were reported for the highly pronated foot”.

      However, that study has way too many flaws to take too seriously. I will add it to my list of what I need to write about!

      (I had lunch with Ian on Sunday!)

  6. Ian Griffiths March 27, 2013 at 12:09 pm #

    Hi guys,

    Just to clear up the above point – despite the conclusions being that there were “no significant effects reported for the highly pronated foot” they did state that this was limited by an inadequate sample size. To put this into perspective, following randomisation there were 7 individuals who were deemed to have “highly pronated” feet and were put into the motion control shoe [Nike Nucleus]. Two of these pulled out of the study due to running related pain and the other 5 all missed at least 1 day of training due to running related pain. This was the only group that had 100% of runners who reported this. Naturally there are other factors at play here, but certainly worthy of comment.

    It wasn’t possible to upload the paper here or even pictures, so I will upload the relevant screenshots of the pdf to my twitter feed/twitpics for those interested.

    Craig – how did you rate my wife’s homemade lasagne??

    • Mark Richard July 5, 2013 at 11:46 pm #


  7. sportinjurymatt March 27, 2013 at 2:03 pm #

    lol… sod the screenshots, share the lasagne! :-p

  8. Craig March 27, 2013 at 11:22 pm #

    Thanks Ian; the lasagne was awesome!

    I see you posted the images:

    Unfortunately that was quite a poorly analyzed study. I will do a more detailed appraisal of it in a post one day soon, mainly because the study got such a ‘free pass’ on most websites (mainly barefoot and minimalist websites), where they did not do any sort of critical appraisal of it and blindly accepted the results. That just doesn’t figure, given the extraordinary lengths that they are now going to in order to dismiss the results from the recent Vibrams and bone stress study!

    Paradoxically and hypocritically, they want it both ways, and they should be held accountable for this. All studies should be critically appraised and should be held to the same standards of appraisal, not different standards depending on whether the study does or does not support your world view! This is largely why we have systematic reviews, to get around these potential biases. I am not sure that this Ryan et al study would pass muster to make it into a systematic review (it might just scrape in).

    Out of interest, I just read this post on Steve Novella’s blog on Evidence Thresholds:

  9. Mark Richard June 1, 2013 at 10:38 pm #

    Change the shape of growing feet perhaps?

  10. Mark Richard February 11, 2014 at 11:26 pm #

    Shoes change the shape of feet. Fact!

    • Craig Payne February 12, 2014 at 2:25 am #

      Who is claiming they don’t? However, you are confusing correlation with causation.
      Care to explain how bunions and hallux valgus also develop in some who never wear shoes?
      Care to also explain why people make up the lie about running shoes weakening muscles, when the evidence says that they don’t?

      • Mark Richard February 12, 2014 at 8:10 am #

        I can’t believe you think modern shoes don’t change feet, and of course people who have never worn shoes develop problems. I just think that if people had better guidance about shoes, and didn’t wear them when they didn’t have to, there would be fewer foot problems.
        Shoes aren’t the problem, Craig; the design of most shoes is the problem.
        For what it’s worth, Craig, I’ve worked in shoe shops since the 80s.

        I wear shoes BTW Craig, no heel, wide toe box etc.

      • Mark Richard February 12, 2014 at 8:12 am #

        Isn’t a shape change in a foot a weakness?

        • Craig Payne February 12, 2014 at 9:21 am #

          Where did I say that modern shoes don’t change feet?

          What I did say is that the changes that the fan boys keep attributing to footwear also happen in those who have never worn shoes. How do you explain that? ….. correlation is not causation.

          • Mark Richard February 12, 2014 at 11:42 am #

            To the same level? Or numbers? Don’t be silly Craig!
            Are heels a good thing for people’s feet? From 1/2″ to 6″ No!
            Stop obsessing over ‘fanboys’; they’re a minority and are getting in the way of your critical thinking and your normally good judgement.

            They come into my shops all the time and I just don’t engage with their nonsense like you seem to do these days.

            Craig, we need you to be part of the solution, not the problem!

  11. Rick Merriam (@RickMerriam) March 16, 2014 at 5:53 pm #

    “If a non runner starts running tomorrow in the most bulky motion controlling running shoe, surely their muscles are going to get stronger and not weaker? How is that shoe actually going to weaken the muscles if they are using the muscles more by running?”

    Craig – Strength is relative. So my first question to you is this: How are you determining the strength of an individual muscle that is contributing to the overall function of the human chain?

    [Example: The tibialis anterior is *super* important to our ability to run.

    The tibialis anterior is a HUGE contributor to preparing the landing gear prior to the lead foot making contact with the ground. And no matter where the foot makes its initial contact with the ground, the tibialis anterior is responsible for eccentrically lengthening to decelerate the foot as it comes into the ground.

    If you were to ask any runner to dorsiflex their foot at the ankle joint, and at the same time, invert the foot when the same side hip and knee are flexed in a supine position on a treatment table, you would most likely see limited motion in both directions. And if you didn’t see limited motion at those joints, you would see limited motion of the lower leg that should be rotating medially (internally) at the knee joint.

    Wherever there is a limitation in range of motion at a joint(s), the muscle that is responsible for the motion(s), is always going to be the culprit.

    To say it another way, does the runner/athlete have sequential strength?

    Whenever the answer comes back as NO in the open-chain, the runner will have inefficient motion in the closed-chain.

    Then, there will be a workaround in the form of compensation.

    According to any anatomy book, in the open-chain, the tibialis anterior muscle is responsible for dorsiflexion of the foot at the ankle joint, and inversion of the foot at a bunch of different joints throughout the foot.

    But yet, the runner that is NOT complaining of foot pain, and NOT struggling with any injuries, is not capable of generating enough internal force to do something so simple when asked to contract the tibialis anterior muscle.

    Note: I’m only referring to one of many muscles here. As you know, every muscle has to play its unique role, in order for us to run with efficiency. But in order to determine whether or not a muscle is capable of pulling a joint(s) to the end range of the motion in question, somebody (anybody?) has to be willing to ask the right questions. And then, be willing to look for the answers.

    The bottom line: Whenever (And wherever!?!) you have a muscle that can not pull the joints that it’s responsible for to the end range of the motion, you not only have a weak muscle, you also have a lack of stability at the joint(s) in question.

    A muscle that is not receiving the adequate amount of feedback from the central nervous system (CNS), will *never* get “stronger”. Never. Ever. Even when a non-runner suddenly decides to take up running. But any muscle that is capable of playing a similar role as the tibialis anterior, will try to pick up the slack for the under-performing muscle (i.e., Extensor Hallucis Longus).

    Note: If anything, the tibialis anterior will most certainly get weaker.

    So, again, what you’re claiming as “strength”, is relative.]

    The intrinsic (foot only) muscles and the extrinsic (foot and lower leg) muscles are not the *only* muscles that contribute to efficient motion of the foot, regardless of the shoe.

    To say the same thing in a slightly different way, running requires motion throughout the entire chain. It just so happens that the foot is being driven from the bottom up, and the top down, and the running shoe gets all of the attention.

    But the truth is, everybody and their uncle is putting the cart before the horse. And most of the people that have a voice in the conversation, don’t even know how the human chain functions when the foot is interacting with the ground.

    That being said, efficient function is about timing. And any running shoe and/or “custom” orthotic that does not allow the foot to move at the right time, or the right plane, *will* most certainly have a negative impact on performance.
