Sunday, June 30, 2013

Cyclonic Activity persists in Arctic


The image above, edited from a Naval Research Laboratory image, shows that a large area of very thin ice has developed at the center of the Arctic Ocean, with thickness in some places down to virtually zero, i.e. open water.

This development is to a large extent caused by persistent cyclonic activity in the Arctic. The Arctic is warming faster than anywhere else, which reduces the temperature difference between the Arctic and lower latitudes. As a result, the polar vortex and jet stream become distorted, resulting in extreme weather. This is graphically illustrated by the animation below, from the California Regional Weather Server.


Note: this animation is a 2.5 MB file that may take some time to fully load.
Credit: California Regional Weather Server

Related

Open Water In Areas Around North Pole - June 22, 2013
http://arctic-news.blogspot.com/2013/06/open-water-in-areas-around-north-pole.html

Thin Spots developing in Arctic Sea Ice - June 13, 2013
http://arctic-news.blogspot.com/2013/06/thin-spots-developing-in-arctic-sea-ice.html

Polar jet stream appears hugely deformed - December 20, 2012

Changes to Polar Vortex affect mile-deep ocean circulation patterns - September 24, 2012
http://arctic-news.blogspot.com/2012/09/changes-to-polar-vortex-affect-mile-deep-ocean-circulation-patterns.html

Diagram of Doom - August 28, 2012
http://arctic-news.blogspot.com/2012/08/diagram-of-doom.html

Opening further Doorways to Doom - August 28, 2012
http://arctic-news.blogspot.com/2012/08/opening-further-doorways-to-doom.html

Huge cyclone batters Arctic sea ice - August 11, 2012

Thursday, June 27, 2013

The Threat of Wildfires in the North

NASA/NOAA image based on Suomi NPP satellite data from April 2012 to April 2013, with grid added
A new map has been issued by NOAA/NASA. The map shows that most vegetation grows in two bands: the Tropical Band (between latitudes 15°N and 15°S) and the Northern Band between 45°N and 75°N, i.e. in North America, Europe and Siberia. On the image above, the map is roughly overlaid with a grid to indicate latitude and longitude coordinates.


Vegetation in the Northern Band extends beyond the Arctic Circle (latitude 66° 33′ 44″ or 66.5622°, in blue on the above image from Arcticsystem.no) into the Arctic, covering sparsely populated areas such as Siberia, Alaska and the northern parts of Canada and Scandinavia. Further into the Arctic, there are huge areas of bush and shrubland that have taken thousands of years to develop; once burnt, vegetation can take a long time to return, due to the short growing season and harsh conditions in the Arctic.



The soil carbon map above further shows that the top 100 cm of soil in the northern circumpolar region contains huge amounts of carbon.

May 16 2013 Drought 90 days Arctic
Global warming increases the risk of wildfires. This is especially applicable to the Arctic, where temperatures have been rising faster than anywhere else on Earth. Anomalies can be very high in specific cases, as illustrated by the temperature map below. High temperatures and drought combine to increase the threat of wildfires (see above image showing drought severity).

June 25, 2013 from Wunderground.com - Moscow broke its more than 100-year-old record for the hottest June 27
Zyryanka, Siberia, recently recorded a high of 37.4°C (99.3°F), against normal high temperatures of 20°C to 21°C for this time of year. Heat wave conditions were also recorded in Alaska recently, with temperatures as high as 96°F (36°C).

On June 19, 2013, NASA captured this image of smoke from wildfires burning in western Alaska. The smoke was moving west over Norton Sound. (The center of the image is roughly 163° West and 62° North.) Red outlines indicate hot spots with unusually warm surface temperatures associated with fire. NASA image by Jeff Schmaltz, LANCE/EOSDIS Rapid Response. Caption by Adam Voiland. - also see this post with NASA satellite image of Alaska.
Siberian wildfires June 21, from RobertScribbler 
from methanetracker.org

Wildfires raged in Russia in 2010. Flames ravaged 1.25 million hectares (4,826 mi²) of land including 2,092 hectares of peat moor.

Damage from the fires was estimated at $15 billion, according to a report in the Guardian.

The cost of fire-fighting efforts and agricultural losses alone is estimated at over $2bn, reports Munich Re, adding that Moscow's inhabitants suffered under a dense cloud of smoke that enveloped the city. In addition to toxic gases, the smoke contained considerable amounts of particulate matter. Mortality increased significantly: the number of deaths in July and August was 56,000 higher than in the same months in 2009. 


[From: Abrupt Local Warming, May 16, 2012]

Wildfires in the North threaten to cause large emissions of greenhouse gases and soot, which can settle on snow and ice in the Arctic and on the Himalayan Plateau; the resulting albedo changes cause a lot more sunlight to be absorbed instead of being reflected, as was previously the case. This in turn adds to the problem. Additionally, rising temperatures in the Arctic threaten to cause the release of huge amounts of methane from sediments below the Arctic Ocean. This situation threatens to escalate into runaway global warming in a matter of years, as illustrated by the image below.

How much will temperatures rise?
In conclusion, the risk is unacceptable and calls for a comprehensive and effective action plan that executes multiple lines of action in parallel, such as the 3-part Climate Action Plan below. Part 1 calls for a sustainable economy, i.e. dramatic reductions of pollutants on land, in oceans and in the atmosphere. Part 2 calls for heat management. Part 3 calls for methane management and further measures.


The Climate Action Plan set out in the diagram above can be initiated immediately in any country, without the need for an international agreement to be reached first. This avoids the delays associated with complicated negotiations and with ongoing verification of implementation and progress in other nations.

In nations with both federal and state governments, such as the United States of America, the Climate Action Plan could be implemented as follows:
  • The President directs federal departments and agencies to reduce their emissions for each type of pollutant annually by a set percentage, say, CO2 and CH4 by 10%, and HFCs, N2O and soot by higher percentages.
  • The President demands that each state make the same cuts. 
  • The President directs the federal Environmental Protection Agency (EPA) to monitor implementation by states and to step in where a state looks set to miss one or more targets, by imposing (federal) fees on applicable polluting products sold in the respective state, with revenues used for federal benefits.
Such federal benefits could include building interstate High-Speed Rail tracks, adaptation and conservation measures, management of national parks, R&D into batteries, ways to vegetate deserts and other land use measures, all at the discretion of the EPA. The fees can be roughly calculated as the average of the fees that other states impose in successful efforts to meet their targets.
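
To put the annual percentage cuts in perspective (a back-of-envelope sketch only, not part of the plan itself), steady 10% annual reductions compound quickly:

```python
# Back-of-envelope: emissions remaining after n years of steady 10% annual cuts.
annual_cut = 0.10
for years in (1, 5, 10, 20):
    remaining = (1 - annual_cut) ** years
    print(f"after {years:2d} years: {remaining:.0%} of today's emissions remain")
# after  1 years: 90%; after  5 years: 59%; after 10 years: 35%; after 20 years: 12%
```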

This way, the decision on how to reach the targets is largely delegated to the state level, while states can similarly delegate decisions to local communities. While feebates, preferably implemented locally, are recommended as the most effective way to reach targets, each state and even each local community can largely decide how to implement things, provided that each of the targets is reached.

Similar targets could be adopted elsewhere in the world, and each nation could similarly delegate responsibilities to local communities. Additionally, it makes sense to agree internationally to impose extra fees on international commercial aviation, with revenues used to develop ways to cool the Arctic.

- Climate Plan

Saturday, June 22, 2013

Open Water In Areas Around North Pole

In some areas around the North Pole, thickness of the sea ice has declined to virtually zero, i.e. open water.


What could have caused this open water? Let's go through some of the background.

Northern Hemisphere snow cover has been low for some time. Snow cover in May 2013 was the lowest on record for Eurasia. There is now very little snow left, as shown on the image on the right, adapted from the National Ice Center.

Low snow cover causes more sunlight to be absorbed, rather than reflected back into space. As can be expected, there are now high surface temperatures in many areas, as illustrated by the NOAA image below. Anomalies can be very high in specific cases. Zyryanka, Siberia, recently recorded a high of 37.4°C, against normal high temperatures of 20°C to 21°C for this time of year. Heat wave conditions were also recorded in Alaska recently (satellite image of Alaska below).

NASA image June 17, 2013, credit: NASA/Jeff Schmaltz, LANCE MODIS Rapid Response Team, NASA GSFC - from caption by Adam Voiland: "Talkeetna, a town about 100 miles north of Anchorage, saw temperatures reach 96°F (36°C) on June 17. Other towns in southern Alaska set all-time record highs, including Cordova, Valdez, and Seward. The high temperatures also helped fuel wildfires and hastened the breakup of sea ice in the Chukchi Sea."
Accordingly, a large amount of relatively warm water from rivers has flowed into the Arctic Ocean, in addition to warm water from the Atlantic and Pacific Oceans.


Sea surface temperatures have been anomalously high in many places around the edges of the sea ice, as also shown on the NOAA image below.


Nonetheless, as the above images also make clear, sea surface temperatures closer to the North Pole have until now remained at or below zero degrees Celsius, with sea ice cover appearing to remain in place. The webcam below from the North Pole Environmental Observatory shows that there still is a lot of ice, at least in some parts around the North Pole.

Webcam #2 of the North Pole Environmental Observatory monitoring UPMC's Atmospheric Buoy, June 21, 2013
So, what could have caused the sea ice to experience such a dramatic thickness decline in some areas close to the North Pole?

Firstly, as discussed in earlier posts, there has been strong cyclonic activity over the Arctic Ocean (see also this Arctic Sea Ice blog post). This has made the sea ice more vulnerable to the rapid decline that is now taking place in many areas.

Furthermore, Arctic sea ice thickness is very low, as illustrated by the image below.

Arctic sea ice volume/extent ratio, adapted by Sam Carana from an image by Neven (click to enlarge)
Finally, there has been a lot of sunshine at the North Pole. At this time of year, insolation in the Arctic is at its highest. Solstice (June 20 or June 21, 2013, depending on time zone) is the day when the Arctic receives the most hours of sunlight, as Earth reaches its maximum axial tilt toward the sun of 23° 26'. In fact, insolation during the months of June and July is higher in the Arctic than anywhere else on Earth, as shown in the image below.

Monthly insolation for selected latitudes -  adapted from Pidwirny, M. (2006), in "Earth-Sun Relationships and Insolation",  Fundamentals of Physical Geography, 2nd Edition
In conclusion, the current rapid decline in sea ice thickness close to the North Pole is mostly due to a combination of earlier cyclonic activity and lots of sunlight, while the sea ice was already very thin to start with. The cyclone broke up the sea ice at the center of the Arctic Ocean, which in turn made it more prone to melting rapidly. The cyclone did more, though, as Arctic-news blog contributor Veli Albert Kallio explains:
"The ocean surface freezes if the temperature falls below -2.5C. The reason for the negative melting point is the presence of 4-5% of sea salt. Only in the polar regions does the sea surface cool sufficiently for sea ice to form during winters.

The sea ice cover is currently thinning near the North Pole between 80-90 degrees north. This part of the ocean is very deep. It receives heat of the Gulf Stream from the south: as the warm water vapourises, its salt content to water increases. This densifies the Gulf Stream which then falls onto the sea floor where it dissipates its heat to the overlying water column. The deep basin of the Arctic Ocean is now getting sufficiently warmed for the thin sea ice cover to thin on top of it. The transportation of heat to the icy surface is combined with the winds that push cold surface water down while rising heat to surface."
Indeed, vertical mixing of the water column was enhanced due to cyclonic activity, and this occurred especially in the parts of the Arctic Ocean that also are the deepest, as illustrated by the animation below.
Legend right: Ice thickness in m from Naval Research Laboratory
Legend bottom: Sea depth (blue) and land height (brown/green)
in m from IBCAO Arctic map at NOAA
The compilation of images below shows how the decline of sea ice has taken place in a matter of weeks.

[ click to enlarge ]
This spells bad news for the future. It confirms earlier analyses (see links below) that the sea ice will disappear altogether within years. It shows that the sea ice is capable of breaking up abruptly, not only at the outer edges, but also at the center of the Arctic Ocean. As the Arctic sea ice keeps declining in thickness, it does indeed look set to break up and disappear abruptly across most of the Arctic Ocean within a few years. Models that are based on sea ice merely shrinking slowly from the outer edges inward should reconsider their projections accordingly.

Related

- Getting the Picture
http://arctic-news.blogspot.com/2012/08/getting-the-picture.html

- Supplementary evidence by Prof. Peter Wadhams
http://arctic-news.blogspot.com/2012/04/supplementary-evidence-by-prof-peter.html

Thursday, June 20, 2013

Discussion meeting vs conference: in praise of slower science




Pompeii mosaic

Plato conversing with his students



As time goes by, I am increasingly unable to enjoy big conferences. I'm not sure how much it's a change in me or a change in conferences, but my attention span shrivels after the first few talks. I don't think I'm alone. Look around any conference hall and everywhere you'll see people checking their email or texting. I usually end up thinking I'd be better off staying at home and just reading stuff.



All this made me start to wonder, what is the point of conferences?  Interaction should be the key thing that a conference can deliver. I have in the past worked in small departments, grotting away on my own without a single colleague who is interested in what I'm doing. In that situation, a conference can reinvigorate your interest in the field, by providing contact with like-minded people who share your particular obsession. And for early-career academics, it can be fascinating to see the big names in action. For me, some of the most memorable and informative experiences at conferences came in the discussion period. If X suggested an alternative interpretation of Y's data, how did Y respond: with good arguments or with evasive arrogance? And how about the time that Z noted important links between the findings of X and Y that nobody had previously been aware of, and the germ of an idea for a new experiment was born?



I think my growing disaffection with conferences is partly fuelled by a decline in the amount and standard of discussion at such events. There's always a lot to squeeze in, speakers will often over-run their allocated time, and in large meetings, meaningful discussion is hampered by the acoustic limitations of large auditoriums. And there's a psychological element too: many people dislike public discussion, and are reluctant to ask questions for fear of seeming rude or self-promotional (see comments on this blogpost for examples). Important debate between those doing cutting-edge work may take place at the conference, but it's more likely to involve a small group over dinner than those in the academic sessions.



Last week, the Royal Society provided the chance for me, together with Karalyn Patterson and Kate Nation, to try a couple of different formats that aimed to restore the role of discussion in academic meetings. Our goal was to bring together researchers from two fields that were related but seldom made contact: acquired and developmental language disorders. Methods and theories in these areas have evolved quite separately, even though the phenomena they deal with overlap substantially.



The Royal Society asks for meeting proposals twice a year, and we were amazed when they not only approved our proposal, but suggested we should have both a Discussion Meeting at the Royal Society in London, and a smaller Satellite meeting at their conference centre at Chicheley Hall in the Buckinghamshire countryside.



We wanted to stimulate discussion, but were aware that if we just had a series of talks by speakers from the two areas, they would probably continue as parallel, non-overlapping streams. So we gave them explicit instructions to interact. For the Discussion meeting, we paired up speakers who worked on similar topics with adults or children, and encouraged them to share their paper with their "buddy" before the meeting. They were asked to devote the last 5-10 minutes of their talk to considering the implications of their buddy's work for their own area. We clearly invited the right people, because the speakers rose to this challenge magnificently. They also were remarkable in all keeping to their allotted 30 minutes, allowing adequate time for discussion. And the discussion really did work: people seemed genuinely fired up to talk about the implications of the work, and the links between speakers, rather than scoring points off each other.



After two days in London, a smaller group of us, feeling rather like a school party, were wafted off to Chicheley in a special Royal Society bus. Here we were going to be even more experimental in our format. We wanted to focus more on early-career scientists, and thanks to generous funding from the Experimental Psychology Society, we were able to include a group of postgrads and postdocs. The programme for the meeting was completely open-ended. Apart from a scheduled poster session, giving the younger people a chance to present their work, we planned two full days of nothing but discussion. Session 1 was the only one with a clear agenda: it was devoted to deciding what we wanted to talk about.



We were pretty nervous about this: it could have been a disaster. What if everyone ran out of things to say and got bored? What if one or two loud-mouths dominated the discussion? Or maybe most people would retire to their rooms and look at email. In fact, the feedback we've had concurs with our own impressions that it worked brilliantly. There were a few things that helped make it a success.


  • The setting, provided by the Royal Society, was perfect. Chicheley Hall is a beautiful stately home in the middle of nowhere. There were no distractions, and no chance of popping out to do a bit of shopping. The meeting spaces were far more conducive to discussion than a traditional lecture theatre.

  • The topic, looking for shared points of interest in two different research fields, encouraged a collaborative spirit, rather than competition.

  • The people were the right mix. We'd thought quite carefully about who to invite; we'd gone for senior people whose natural talkativeness was powered by enthusiasm rather than self-importance. People had complementary areas of expertise, and everyone, however senior, came away feeling they'd learned something.

  • Early-career scientists were selected from those applying, on the basis that their supervisor indicated they had the skills to participate fully in the experience. Nine of them were selected as rapporteurs, and were required to take notes in a break-out session, and then condense 90 minutes of discussion into a 15-minute summary for the whole group.  All nine were quite simply magnificent in this role, and surpassed our expectations. The idea of rapporteurs was, by the way, stimulated by experience at Dahlem conferences, which pioneered discussion-based meetings, and subsequent Strüngmann forums, which continue the tradition.

  • Kate Nation noted that at the London meeting, the discussion had been lively and enjoyable, but largely excluded younger scientists. She suggested that for our discussions at Chicheley, nobody over the age of 40 should be allowed to talk for the first 10 minutes. The Nation Rule proved highly effective - occasionally broken, but greatly appreciated by several of the early career scientists, who told us that they would not have spoken out so much without this encouragement.


I was intrigued to hear from Uta Frith that there is a Slow Science movement, and I felt the whole experience fitted with their ethos: encouraging people to think about science rather than frenetically rushing on to the next thing. Commentary on this has focused mainly on the day-to-day activities of scientists and publication practices (Lutz, 2012). I haven't seen anything specifically about conferences from the Slow Science movement (and since they seem uninterested in social media, it's hard to find out much about them!), but I hope that we'll see more meetings like this, where we all have time to pause, ponder and discuss ideas.

 



Reference

Lutz, J. (2012). Slow science. Nature Chemistry, 4(8), 588-589. DOI: 10.1038/nchem.1415

Extreme weather becomes the norm - what can you do?

. . a sky that has turned red due to greenhouse gases, while the land is flooded. The handful of
people who survived are standing by helplessly on higher grounds, in despair and without hope,
while one figure turns to me in panic and pain, uttering nothing but a silent scream . . .
(comment by Sam Carana, March 8, 2012, on auction of the Scream, by Edvard Munch)

Symptoms

Torrential rains in some regions are causing massive floods, while in other locales record droughts are occurring with greater frequency, severity and areal extent around the globe. Global food production is being hit hard, leading to large price increases and political instability. Areas under drought are experiencing numerous massive forest fires of incredible ferocity.

Causes

The statistics of extreme weather events have changed for the worse due to changes in the location, speed and waviness of the jet streams, which guide weather patterns and separate cold, dry northern air from warm, moist southern air. The jet streams have changed because the equator-to-North-Pole temperature difference has decreased due to the huge temperature rise in the Arctic.

The huge temperature rise in the Arctic is due to a collapse in the area of highly reflective snow and ice, which is caused by melting. The melting is driven by warming from the increase in greenhouse gases from fossil fuel burning. The Arctic sea ice and spring snow cover will vanish within a few years, and the weather extremes will increase at least 10x.

What can you do?

Go talk to your politicians and friends about climate change and the need to slash fossil fuel emissions. Immediately. Cut and paste my comments above and post them on Facebook, send them to newspapers, and educate yourself on the science behind all the above linkages. Leave my name on or take it off and plagiarize all you want, just get this knowledge out there...

From an unmuzzled climate scientist...
Paul Beckwith, B.Eng, M.Sc. (Physics),
Ph.D. student (Climatology) and Part-time Professor,
University of Ottawa

originally posted as a comment under the CBCnews post:
Calgary braces for flooding, orders communities evacuated 

Related

- The Tornado Connection to Climate Change
- President Obama, here's a climate plan!
- Diagram of Doom
- Polar jet stream appears hugely deformed
- Ten Dangers of Global Warming (originally posted March 8, 2007)

Tuesday, June 18, 2013

Mean Methane Levels reach 1800 ppb

On May 9, the daily mean concentration of carbon dioxide in the atmosphere at Mauna Loa, Hawaii, surpassed 400 parts per million (ppm) for the first time since measurements began in 1958. This is 120 ppm higher than the pre-industrial peak level. This unfortunate milestone was widely reported in the media.

There's another milestone that looks even more threatening than the one above. On the morning of June 16, 2013, methane levels reached a mean of 1800 parts per billion (ppb). This is more than 1100 ppb higher than levels reached in pre-industrial times (see graph further below).
NOAA image
Vostok ice core analysis shows that temperatures and levels of carbon dioxide and methane have all moved within narrow bands, while remaining in sync with each other, over the past 400,000 years. Carbon dioxide moved within a band with lower and upper boundaries of 200 and 280 ppm respectively. Methane moved within lower and upper boundaries of 400 and 800 ppb respectively. Temperatures moved within lower and upper boundaries of -8 and 2 degrees Celsius respectively.

From a historic perspective, greenhouse gas levels have risen abruptly to unprecedented levels. With levels already at a historic peak, humans keep adding further emissions of greenhouse gases. There's no doubt that such greenhouse gas levels will lead to huge rises in temperature. The question is how long it will take for temperatures to catch up and rise.


Below is another way of looking at the hockey stick. And of course, further emissions, such as nitrous oxide and soot, could be added as well.



Large releases of methane must have taken place numerous times in history, as evidenced by numerous pockmarks, as large as 11 km (6.8 mi) wide.

Importantly, large methane releases in the past did not result in runaway global warming for a number of reasons:
  • methane releases typically took place gradually over many years, allowing each large release to be broken down naturally before another one occurred. 
  • Where high levels of methane in the atmosphere persisted and caused a lot of heat to be trapped, this heat could still be coped with, due to the far greater presence of ice, which acted as a buffer and consumed the heat before it could escalate into runaway temperature rises.
Wikipedia image
Veli Albert Kallio comments:

The problem with ice cores is that if there is a sudden methane surge, the climate warms very rapidly. This results in the glacier surfaces melting away, and the ice core progressively loses its surface data if there is too much methane in the air.

Because of this, there have been previous occurrences of high methane, and these were instrumental in bringing the ice sheets of the ice ages to an end (Euan Nisbet's Royal Society paper). The key is to look at some key anomalies and devise the right experiments to test the hypothesis of methane eruptions ending the ice ages.

Thus, the current methane release and the rise to 1800 ppb is nothing new, except that there are no huge Pleistocene glaciers to cool the Arctic Ocean if methane goes into overdrive this time. In fact, methane may have been many times higher than that, but all surface ice kept melting away and retreating until cold water and ice from destabilised ice sheets stopped the supply of methane (it decays fast if the supply is cut, and temperatures fell back rapidly when seas rose).

The Laurentide Ice Sheet alone was the equivalent of 25 Greenland Ice Sheets, with the Weichselian and other sheets on top of that. So today's glaciers cannot act in the same way as a fireman to extinguish methane. Runaway global warming is now a possibility if the Arctic loses its methane-holding capability due to warming.

Further discussion is invited on the following points:
  • The large carbon-12 emission anomalies in East Asian historical objects that are dateable by historical knowledge; discussion of the explanations concocted and why methane emission from permafrost soils and sea beds must be the answer; 
  • the much overlooked fact that if there were ever very highly elevated concentrations of methane in the Arctic air, this would induce strong melting of glaciers, which would then lack surface depositions from the periods when the air was most CH4- and CO2-laden. Even moderate levels of temperature rise damaged the Larsen A, Larsen B, Petermann and Ellesmere glaciers. If huge runaway outgassing occurred when Beringia flipped into soil warming, then methane came out in really large amounts along with CO2;
  • discussion of experiments to compensate for the possible lack of "time" in methane-elevated periods in the ice cores, using alternative methods to obtain daily, weekly, monthly and yearly emission rates of CH4 and CO2 from the Last Glacial Maximum to the Holocene Thermal Maximum (as daily, weekly, monthly and yearly sampling of air).

Editor's update: Methane levels go up and down with the seasons, and differ by altitude. As the above post shows, mean levels reached 1800 ppb in May 2013 at 586 mb, according to MetOp-2 data. Note that IPCC AR5 gives levels of 1798 ppb in 2010 and 1803 ppb in 2011, as further discussed in later posts such as this one. Also, see the historic data supplied by NOAA below.




Monday, June 17, 2013

Research fraud: More scrutiny by administrators is not the answer

I read this piece in the Independent this morning and an icy chill gripped me. Fraudulent researchers have been damaging Britain's scientific reputation and we need to do something. But what? Sadly, it sounds like the plan is to do what is usually done when a moral panic occurs: increase the amount of regulation.



So here is my, very quick, response – I really have lots of other things I should be doing, but this seemed urgent, so apologies for typos etc.



According to the account in the Independent, Universities will not be eligible for research funding unless they sign up to a Concordat for Research Integrity which entails, among other things, that they "will have to demonstrate annually that each team member’s graphs and spreadsheets are precisely correct."



We already have massive regulation around the ethics of research on human participants that works on the assumption that nobody can be trusted, so we all have to do mountains of paperwork to prove we aren't doing anything deceptive or harmful. 



So, you will ask, am I in favour of fraud and sloppiness in research? Of course not. Indeed, I devote a fair part of my blog to criticisms of what I see as dodgy science: typically, not outright fraud, but rather over-hyped or methodologically weak work, which is, to my mind, a far greater problem. I agree we need to think about how to fix science, and that many of our current practices lead to non-replicable findings. I just don't think more scrutiny by administrators is the solution. To start scrutinising datasets is just silly: this is not where the problem lies.



So what would I do? The answers fall into three main categories: incentives, publication practices, and research methods.



Incentives is the big one. I've been arguing for years that our current reward system distorts and damages science. I won't rehearse the arguments again: you can read them here. 
The current Research Excellence Framework is, to my mind, an unnecessary exercise that further incentivizes researchers against doing slow and careful work. My first recommendation is therefore that we ditch the REF and use simpler metrics to allocate research funding to universities, freeing up a great deal of time and money, and improving the security of research staff. Currently, we have a situation where research stardom, assessed by REF criteria, is all-important. Instead of valuing papers in top journals, we should be valuing research replicability.



Publication practices are problematic, mainly because the top journals prioritize exciting results over methodological rigour. There is therefore a strong temptation to do post hoc analyses of data until an exciting result emerges. Pre-registration of research projects has been recommended as a way of dealing with this - see this letter to the Guardian on which I am a signatory. 
It might be even more effective if research funders adopted the practice of requiring researchers to specify the details of their methods and analyses in advance on a publicly-available database. And once the research was done, the publication should contain a link to a site where data are openly available for scrutiny – with appropriate safeguards about conditions for re-use.



As regards research methods, we need better training of scientists to become more aware of the limitations of the methods that they use. Too often statistical training is a dry and inaccessible discipline. All scientists should be taught how to generate random datasets: nothing is quite as good at instilling a proper understanding of p-values as seeing the apparent patterns in data that will inevitably arise if you look hard enough at some random numbers. In addition, not enough researchers receive training in best practices for ensuring quality of data entry, or in exploratory data analysis to check the numbers are coherent and meet assumptions of the analytic approach.
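
As a minimal sketch of the kind of exercise I have in mind (the specifics here are my own illustration, not part of any curriculum): draw two groups from exactly the same distribution, test for a difference, and repeat. Any "significant" results are pure noise.

```python
import numpy as np
from scipy import stats

rng = np.random.default_rng(1)

n_datasets = 1000
false_positives = 0
for _ in range(n_datasets):
    # Two groups of 20 'participants' drawn from the same normal distribution: no real effect.
    group_a = rng.normal(size=20)
    group_b = rng.normal(size=20)
    _, p = stats.ttest_ind(group_a, group_b)
    if p < 0.05:
        false_positives += 1

# With no true effect anywhere, roughly 5% of comparisons still come out 'significant',
# and the more variables and subgroups you inspect, the more such findings accumulate.
print(f"{false_positives} of {n_datasets} null comparisons gave p < .05")
```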



In my original post on expansion of regulators, I suggested that before a new regulation is introduced, there should be a cold-blooded cost-benefit analysis that considers, among other things, the cost of the regulation both in terms of the salaries of people who implement it, and the time and other costs to those affected by it. My concern is that among the 'other costs' is something rather nebulous that could easily get missed. Quite simply, doing good research takes time and mental space of the researchers. Most researchers are geeks who like nothing better than staring at data and thinking about complicated problems. If you require them to spend time satisfying bureaucratic requirements, this saps the spirit and reduces creativity.



I think we can learn much from the way ethics regulations have panned out. When a new system was first introduced in response to the Alder Hey scandal, I'm sure many thought it was a good idea. It has taken several years for the full impact to be appreciated. The problems are documented in a report by the Academy of Medical Sciences, which noted: "Urgent changes are required to the regulation and governance of health research in the UK because unnecessary delays, bureaucracy and complexity are stifling medical advances, without additional benefits to patient safety."



If the account in the Independent is to be believed, then the Concordat for Research Integrity could lead to a similar outcome. I'm glad I will retire before it is fully implemented.

Sunday, June 16, 2013

Arctic Sea Ice September 2013 Projections

What will the Arctic Sea Ice look like in September 2013?

Several projections for Arctic sea ice extent are being discussed at places such as ARCUS (Arctic Research Consortium of the United States) and the Arctic Sea Ice Blog. The image below, from ARCUS, shows various projections of September 2013 Arctic sea ice extent (defined as the monthly average for September), with a median value of 4.1 million square kilometers and quartiles of 3.8 and 4.4 million square kilometers.


Note that sea ice extent in the above projections is defined as area of ocean with at least 15% ice, in line with the way the NSIDC calculates extent. By contrast, the Danish Meteorological Institute includes areas with ice concentration higher than 30% to calculate ice extent.

Rather than looking at the projected average for September, one could also project the minimum value for September 2013. And rather than looking at sea ice extent, one could also look at sea ice area, which differs from sea ice extent as the NSIDC FAQ page describes:
A simplified way to think of extent versus area is to imagine a slice of Swiss cheese. Extent would be a measure of the edges of the slice of cheese and all of the space inside it. Area would be the measure of where there is cheese only, not including the holes. That is why if you compare extent and area in the same time period, extent is always bigger.
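
To make the distinction concrete, here is a minimal sketch of the calculation, using a made-up concentration grid purely for illustration (the 15% and 30% thresholds are the NSIDC and DMI conventions mentioned above):

```python
import numpy as np

# Hypothetical grid of sea ice concentration (fraction 0-1) and cell areas (km²).
np.random.seed(0)
concentration = np.random.rand(10, 10)      # stand-in for satellite-derived concentrations
cell_area = np.full((10, 10), 3000.0)       # stand-in for true grid-cell areas in km²

threshold = 0.15                            # NSIDC uses 15%; DMI uses 30%
ice_cells = concentration >= threshold

extent = cell_area[ice_cells].sum()                   # whole area of every cell above the threshold
area = (cell_area * concentration)[ice_cells].sum()   # only the ice-covered fraction of each cell

print(f"extent: {extent:.0f} km², area: {area:.0f} km²")  # extent >= area, as in the Swiss cheese analogy
```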


The above image shows Sam Carana's projected minimum area of 2 million square km for 2013, based on data by Cryosphere Today and on numerous factors, such as continued warming of the water underneath the ice, stronger cyclones, etc.
Roughly in line with the above image by Wipneus, Sam Carana projects an Arctic sea ice minimum volume of 2,000 cubic km in September 2013.

Readers are invited to submit comments below with further projections.

Overhyped genetic findings: the case of dyslexia

A press release by Yale University Press Office was recently recycled on the Research Blogging website*, announcing that their researchers had made a major breakthrough. Specifically they said "A new study of the genetic origins of dyslexia and other learning disabilities could allow for earlier diagnoses and more successful interventions, according to researchers at Yale School of Medicine. Many students now are not diagnosed until high school, at which point treatments are less effective." The breathless account by the Press Office is hard to square with the abstract of the paper, which makes no mention of early diagnosis or intervention, but rather focuses on characterising a putative functional risk variant in the DCDC2 gene, named READ1, and establishing its association with reading and language skills.



I've discussed why this kind of thing is problematic in a previous blogpost, but perhaps a figure will help. The point is that in a large sample you can have a statistically strong association between a condition such as dyslexia and a genetic variant, but this does not mean that you can predict who will be dyslexic from their genes.






Proportions with risk variants estimated from Scerri et al (2011)

In this example, based on one of the best-replicated associations in the literature, you can see that most people with dyslexia don't have the risk version of the gene, and most people with the risk version of the gene don't have dyslexia. The effect sizes of individual genetic variants can be very small even when the strength of genetic association is large.
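
To make the point concrete, here is a sketch with deliberately made-up numbers (they are not estimates from Scerri et al or from the Yale paper): a risk version carried by 30% of unaffected children and 40% of children with dyslexia yields an overwhelmingly "significant" association in a sample of 20,000, yet carrying it barely changes an individual child's risk.

```python
from scipy import stats

# Illustrative numbers only - NOT estimates from the papers discussed above.
prevalence = 0.07        # proportion of children with dyslexia
p_risk_dyslexic = 0.40   # proportion of dyslexic children carrying the risk version
p_risk_control = 0.30    # proportion of unaffected children carrying the risk version
n = 20000                # a large association-study sample

dyslexic = n * prevalence
control = n - dyslexic
# 2 x 2 table: rows = dyslexic / unaffected, columns = risk version / no risk version
table = [[dyslexic * p_risk_dyslexic, dyslexic * (1 - p_risk_dyslexic)],
         [control * p_risk_control, control * (1 - p_risk_control)]]
chi2, p, dof, expected = stats.chi2_contingency(table)
print(f"association p-value: {p:.1e}")  # vanishingly small, i.e. a 'strong' association

# ...yet the variant is almost useless for predicting who is dyslexic:
p_dyslexia_given_risk = (prevalence * p_risk_dyslexic) / (
    prevalence * p_risk_dyslexic + (1 - prevalence) * p_risk_control)
print(f"P(dyslexia | risk version)    = {p_dyslexia_given_risk:.2f}")  # about 0.09
print(f"P(no risk version | dyslexia) = {1 - p_risk_dyslexic:.2f}")    # 0.60
```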



So what about the results from the latest Yale press release? Do they allow for more accurate identification of dyslexia on the basis of genes? In a word, no. I was pleased to see that the authors reported the effect sizes associated with the key genetic variants, which makes it relatively easy to estimate their usefulness in screening. In addition to identifying two sequences in DCDC2 associated with risk of language or reading problems, the authors noted an interaction with a risk version of another gene, KIAA0319, such that children with risk versions in both genes were particularly likely to have problems.  The relevant figure is shown here.






Fig 3A from Powers et al (2013)



There are several points to note from this plot, bearing in mind that dyslexia or SLI would normally only be diagnosed if a child's reading or language scores were at least 1.0 SD below average.


  1. For children who have either KIAA0319 or DCDC2 risk variants, but not both, the average score on reading and language measures is at most 0.1 SD below average.

  2. For those who have both risk factors together, some tests give scores that are 0.3 SD below average, but this is only a subset of the reading/language measures. On nonword reading, often used as a diagnostic test for dyslexia, there is no evidence of any deficit in those with both risk versions of the genes. On the two language measures, the deficit hovers around 0.15 SD below the mean.

  3. The tests that show the largest deficits in those with two risk factors are measures of IQ rather than reading or language. Even here, the degree of impairment in those with two risk factors together indicates that the majority of children with this genotype would not fall in the impaired range (a rough calculation after this list makes that concrete).

  4. The number of children with the two risk factors together is very small, around 1% of the population.
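
A rough back-of-envelope check of point 3 (assuming, as the figure implies, that scores are expressed in SD units and are roughly normally distributed) shows how little a 0.3 SD shift in the group mean changes any individual child's chance of scoring in the impaired range:

```python
from scipy.stats import norm

cutoff = -1.0   # impairment typically defined as at least 1 SD below the mean
shift = -0.3    # approximate mean deficit for children carrying both risk versions (Fig 3A)

p_impaired_baseline = norm.cdf(cutoff, loc=0.0, scale=1.0)       # ~0.16 of all children
p_impaired_double_risk = norm.cdf(cutoff, loc=shift, scale=1.0)  # ~0.24 of double-risk children

print(f"chance of scoring below cutoff, general population: {p_impaired_baseline:.2f}")
print(f"chance of scoring below cutoff, both risk versions:  {p_impaired_double_risk:.2f}")
# So roughly three out of four children with both risk versions still score above the cutoff,
# and they make up only about 1% of the population to begin with.
```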


In sum, I think this is an interesting paper that might help us discover more about how genetic variation works to influence cognitive development by affecting brain function. The authors present the data in a way that allows us to appraise the clinical significance of the findings quite easily. However, the results indicate that, far from indicating translational potential for diagnosis and treatment, genetic effects are subtle and unlikely to be useful for this purpose.



*It is unclear to me whether the Yale University Press Office are actively involved in gatecrashing Research Blogging, or whether this is just an independent 'blogger' who is recycling press releases as if they are blogposts.



Reference

Powers, N., Eicher, J., Butter, F., Kong, Y., Miller, L., Ring, S., Mann, M., & Gruen, J. (2013). Alleles of a polymorphic ETV6 binding site in DCDC2 confer risk of reading and language impairment. The American Journal of Human Genetics. DOI: 10.1016/j.ajhg.2013.05.008

Scerri, T. S., Morris, A. P., Buckingham, L. L., Newbury, D. F., Miller, L. L., Monaco, A. P., . . . Paracchini, S. (2011). DCDC2, KIAA0319 and CMIP are associated with reading-related traits. Biological Psychiatry, 70, 237-245. doi: 10.1016/j.biopsych.2011.02.005
 

Friday, June 7, 2013

Interpreting unexpected significant results




©www.cartoonstock.com

Here's a question for researchers who use analysis of variance (ANOVA). Suppose I set up a study to see if one group (e.g. men) differs from another (women) on brain response to auditory stimuli (e.g. standard sounds vs deviant sounds – a classic mismatch negativity paradigm). I measure the brain response at frontal and central electrodes located on two sides of the head. The nerds among my readers will see that I have here a four-way ANOVA, with one between-subjects factor (sex) and three within-subjects factors (stimulus, hemisphere, electrode location). My hypothesis is that women have bigger mismatch effects than men, so I predict an interaction between sex and stimulus, but the only result significant at p < .05 is a three-way interaction between sex, stimulus and electrode location. What should I do?




a) Describe this as my main effect of interest, revising my hypothesis to argue for a site-specific sex effect

b) Describe the result as an exploratory finding in need of replication

c) Ignore the result as it was not predicted and is likely to be a false positive



I'd love to do a survey to see how people respond to these choices; my guess is many would opt for a) and few would opt for c). Yet in this situation, the likelihood of the result being a false positive is very high – much higher than many people realise.


Many people assume that if an ANOVA output is significant at the .05 level, there's only a one in twenty chance of it being a spurious chance effect. We have been taught that we do ANOVA rather than numerous t-tests because ANOVA adjusts for multiple comparisons. But this interpretation is quite wrong. ANOVA adjusts for the number of levels within a factor, so, for instance, the probability of finding a significant effect of group is the same regardless of how many groups you have. ANOVA makes no adjustment to p-values for the number of factors and interactions in your design. The more of these you have, the greater the chance of turning up a "significant" result.


So, for the example given above, the probability of finding something significant at .05 is as follows. For the four-way ANOVA example, we have 15 terms (four main effects, six 2-way interactions, four 3-way interactions and one 4-way interaction), and the probability of finding no significant effect is .95^15 = .46. It follows that the probability of finding something significant is .54.


And for a three-way ANOVA there are seven terms (three main effects, three 2-way interactions and one 3-way interaction), and p (something significant) = .30.
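
As a quick check on that arithmetic (treating the terms as independent tests at alpha = .05, which is only approximately true, as the postscript below notes), a couple of lines of code reproduce the figures:

```python
# Chance of at least one spurious 'significant' term among k ANOVA terms,
# each tested at alpha = .05 under the null, treated here as independent tests.
def p_any_significant(n_terms, alpha=0.05):
    return 1 - (1 - alpha) ** n_terms

print(round(p_any_significant(15), 2))  # 4-way ANOVA, 15 terms: 0.54
print(round(p_any_significant(7), 2))   # 3-way ANOVA, 7 terms:  0.30
print(round(p_any_significant(8), 2))   # the 8 terms involving group (see below): 0.34
```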


So, basically, if you do a four-way ANOVA, and you don't care what result comes out, provided something is significant, you have a slightly greater than 50% chance of being satisfied. This might seem like an implausible example: after all, who uses ANOVA like this? Well, unfortunately, this example corresponds rather closely to what often happens in electrophysiological research using event-related potentials (ERPs). In this field, the interest is often in comparing a clinical and a control group, and so some results are more interesting than others: the main effect of group, and the seven interactions with group, are the principal focus of attention. But hypotheses about exactly what will be found are seldom clearcut: excitement is generated by any p-value associated with a group term that falls below .05. There's a one in three chance that one of the terms involving group will have a p-value this low. This means that the potential for 'false positive psychology' in this field is enormous (Simmons et al, 2011).

A corollary of this is that researchers can modify the likelihood of finding a "significant" result by selecting one ANOVA design rather than another. Suppose I'm interested in comparing brain responses to standard and deviant sounds. One way of doing this is to compute the difference between ERPs to the two auditory stimuli and use this difference score as the dependent variable: this reduces my ANOVA from a 4-way to a 3-way design, and gives fewer opportunities for spurious findings. So you will get a different risk of a false positive, depending on how you analyse the data.



Another feature of ERP research is that there is flexibility in how electrodes are handled in an ANOVA design: since there is symmetry in electrode placement, it is not uncommon to treat hemisphere as one factor, and electrode site as another. The alternative is just to treat electrode as a repeated measure. This is not a neutral choice: the chances of spurious findings are greater if one adopts the first approach, simply because it adds a factor to the analysis, plus all the interactions with that factor.




I stumbled across these insights into ANOVA when I was simulating data using a design adopted in a recent PLOS One paper that I'd commented on. I was initially interested in looking at the impact of adopting an unbalanced design in ANOVA: this study had a group factor with sample sizes of 20, 12 and 12. Unbalanced designs are known to be problematic for repeated measures ANOVA and I initially thought this might be the reason why simulated random numbers were giving such a lot of "significant" p-values. However, when I modified the simulation to use equal sample sizes across groups, the analysis continued to generate far more low p-values than I had anticipated, and I eventually twigged that this was because this is what you get if you use 4-way ANOVA. For any one main effect or interaction, the probability of p < .05 was one in twenty: but the probability that at least one term in the analysis would give p < .05 was closer to 50%.
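
For readers who want to try this themselves, here is a stripped-down sketch of that kind of simulation. It is not the design of the PLOS One paper (their factors were repeated measures; for simplicity this version uses a fully between-subjects 2×2×2×2 design), but it makes the same point: feed a 4-way ANOVA pure noise and roughly half the runs yield at least one p < .05 among the 15 terms.

```python
from itertools import product

import numpy as np
import pandas as pd
import statsmodels.api as sm
import statsmodels.formula.api as smf

rng = np.random.default_rng(42)

def one_null_experiment(n_per_cell=5):
    # A fully between-subjects 2x2x2x2 design filled with pure noise.
    rows = []
    for a, b, c, d in product([0, 1], repeat=4):
        for _ in range(n_per_cell):
            rows.append({"a": a, "b": b, "c": c, "d": d, "y": rng.normal()})
    df = pd.DataFrame(rows)
    model = smf.ols("y ~ C(a) * C(b) * C(c) * C(d)", data=df).fit()
    anova_table = sm.stats.anova_lm(model, typ=2)
    pvals = anova_table["PR(>F)"].dropna()   # 15 terms; the residual row has no p-value
    return bool((pvals < 0.05).any())

n_sims = 200
hits = sum(one_null_experiment() for _ in range(n_sims))
print(f"{hits}/{n_sims} null experiments produced at least one p < .05")  # expect roughly half
```

Dropping one factor from the same sketch (a 3-way version) brings the rate down to roughly 30%, in line with the calculation earlier in the post.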


The analytic approach adopted in the PLOS One paper is pretty standard in the field of ERP. Indeed, I have seen papers where 5-way or even 6-way repeated measures ANOVA is used. When you do an ANOVA and it spews out the results, it's tempting to home in on the results that achieve the magical significance level of .05 and then formulate some kind of explanation for the findings. Alas, this is an approach that has left the field swamped by spurious results.


There have been various critiques of analytic methods in ERP, but I haven't yet found any that have focussed on this point. Kilner (2013) has noted the bias that arises when electrodes or windows are selected for analysis post hoc, on the basis that they give big effects. Others have noted problems with using electrode as a repeated measure, given that ERPs at different electrodes are often highly correlated. More generally, statisticians are urging psychologists to move away from using ANOVA to adopt multi-level modelling, which makes different assumptions and can cope, for instance, with unbalanced designs. However, we're not going to fix the problem of "false positive ERP" by adopting a different form of analysis. The problem is not just with the statistics, but with the use of statistics for what are, in effect, unconstrained exploratory analyses. Researchers in this field urgently need educating in the perils of post hoc interpretation of p-values and the importance of a priori specification of predictions.


I've argued before that the best way to teach people about statistics is to get them to generate their own random data sets. In the past, this was difficult, but these days it can be achieved using free statistical software, R. There's no better way of persuading someone to be less impressed by p < .05 than to show them just how readily a random dataset can generate "significant" findings. Those who want to explore this approach may find my blog on twin analysis in R useful for getting started (you don't need to get into the twin bits!).


The field of ERP is particularly at risk of spurious findings because of the way in which ANOVA is often used, but the problem of false positives is not restricted to this area, nor indeed to psychology. The mindset of researchers needs to change radically, with a recognition that our statistical methods only allow us to distinguish signal from noise in the data if we understand the nature of chance.


Education about probability is one way forward. Another is to change how we do science to make a clear distinction between planned and exploratory analyses. This post was stimulated by a letter that appeared in the Guardian this week on which I was a signatory. The authors argued that we should encourage a system of pre-registration of research, to avoid the kind of post hoc interpretation of findings that is so widespread yet so damaging to science.






Reference



Simmons, J. P., Nelson, L. D., & Simonsohn, U. (2011). False-positive psychology. Psychological Science, 1359-1366. DOI: 10.1037/e636412012-001



This article (Figshare version) can be cited as:
Bishop, Dorothy V M (2014): Interpreting unexpected significant findings. figshare.
http://dx.doi.org/10.6084/m9.figshare.1030406







PS. 2nd July 2013

There's remarkably little coverage of this issue in statistics texts, but Mark Baxter pointed me to a 1996 manual for SYSTAT that does explain it clearly. See: http://www.slideshare.net/deevybishop/multiway-anova-and-spurious-results-syt

The authors noted "Some authors devote entire chapters to fine distinctions between multiple comparison procedures and then illustrate them within a multi-factorial design not corrected for the experiment-wise error rate." 

They recommend doing a Q-Q plot to see if the distribution of p-values is different from expectation, and using Bonferroni correction to guard against type I error.
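
In practice, that advice amounts to something like the following sketch (my own illustration with made-up p-values, not code from the SYSTAT manual): compare the sorted p-values for the ANOVA terms with the uniform quantiles expected under the null, and only flag a term whose p-value survives a .05/k threshold.

```python
import numpy as np

# Made-up p-values for the 15 terms of a 4-way ANOVA, purely for illustration.
pvals = np.array([0.03, 0.81, 0.12, 0.64, 0.007, 0.55, 0.29, 0.91,
                  0.47, 0.18, 0.72, 0.38, 0.09, 0.60, 0.25])
k = len(pvals)

# Bonferroni: to keep the experiment-wise error rate at .05, test each term at .05 / k.
bonferroni_threshold = 0.05 / k
print("terms surviving Bonferroni:", pvals[pvals < bonferroni_threshold])  # here: none

# Q-Q style check: under the null, the sorted p-values should track the uniform quantiles.
expected = (np.arange(1, k + 1) - 0.5) / k
for observed, exp in zip(np.sort(pvals), expected):
    print(f"observed {observed:.3f}   expected {exp:.3f}")
```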



They also note that the different outputs from an ANOVA are not independent if they are based on the same mean squares denominator, a point that is discussed here:

Hurlburt, R. T., & Spiegel, D. K. (1976). Dependence of F Ratios Sharing a Common Denominator Mean Square. The American Statistician, 30(2), 74-78. doi: 10.2307/2683798

These authors conclude (p 76)

It is important to realize that the appearance of two significant F ratios sharing the same denominator should decrease one's confidence in rejecting either of the null hypotheses. Under the null hypothesis, significance can be attained either by the numerator mean square being "unusually" large, or by the denominator mean square being "unusually" small. When the denominator is small, all F ratios sharing that denominator are more likely to be significant. Thus when two F ratios with a common denominator mean square are both significant, one should realize that both significances may be the result of unusually small error mean squares. This is especially true when the numerator degrees of freedom are not small compared to the denominator degrees of freedom.