The Iberian blackout demonstrated the importance of voltage control and reactive power, and how a weak grid, with poor controls, was brought down by a single faulty solar inverter. In this second part of my analysis of the Iberian blackout, I examine the specific technical causes of the incident. Where technical concepts such as frequency, voltage, and oscillation dynamics are not explained here, they are covered in Part 1, which outlines the physical principles and control challenges in modern grids.
This blog is based on a concise but informative report produced by Red Eléctrica de España (“REE”), the Spanish Transmission System Operator (“TSO”), which is more accessible than the much longer government report (available only in Spanish – rough English translation here).
The key messages from the REE report are:
- The blackout was triggered by a PV inverter–induced voltage oscillation
- Inappropriate disconnections of wind and solar generation, and widespread failure of reactive power support, escalated the disturbance
- REE relied on static controls and failed to deploy dynamic response assets
- Grid code non-compliance was widespread among renewables, conventional generators, and even REE itself (via non-compliant transformers)
- The collapse exposes systemic risks in low-inertia grids with high levels of inverter-based resources (“IBRs”) and inadequate voltage control
It is notable that, despite confident denials from some renewables advocates in the immediate aftermath, it was in fact a malfunctioning solar installation that triggered the voltage oscillation initiating the collapse. Wind and solar generators failed to meet fault ride-through obligations, and both inverter-based and conventional generators failed to provide the required reactive power support. Crucially, conventional generators did not trip prematurely – they remained online until system conditions breached their design tolerances.
The Iberian grid was already in a weakened state, owing to insufficient synchronous generation and excessive reliance on inverter-based renewables. The system failed to withstand a fault that originated with a single solar inverter. This was not an unavoidable technical event – it was the result of systemic underestimation of voltage control risks, poor compliance enforcement, and REE’s failure to schedule or deploy sufficient dynamic voltage support.
This blackout would not have occurred in a conventional, high-synchronous grid. The rush to decarbonise the power system without adequate attention to resilience and enforcement has created an atmosphere of complacency. That complacency – shared by policymakers, regulators, and parts of the renewables industry – led directly to a system-wide collapse that cost eleven lives.
Summary of the events leading up to the Iberian blackout
Below is the 18-point summary of events from the REE report, which it says is based on the best available information at the time of writing. (An “N-1 scenario” refers to a system’s ability to withstand the failure of a single component without disruption to service: “N” represents the total number of components, and “minus one” indicates that one component is removed, whether through failure or planned outage. This principle is central to maintaining grid reliability and a continuous power supply.) The events are described as having “occurred in succession and some of them individually could be assimilated to an N-1 scenario”:
- Forced oscillation at 0.6 Hz, possibly originating in a photovoltaic power plant in the province of Badajoz, triggers system-altering protocolized actions. Shunt reactors are operated, lines are coupled due to oscillations, and schedules are modified. (N-1)
- Natural oscillation at 0.2 Hz triggers further system-altering protocolised actions. Shunt reactors are operated, lines are coupled due to oscillations, and schedules are adjusted. (N-2)
- Generation under P.O. 7.4 does not absorb the required reactive power. (N-3)
- Variations in RCW generation during active power regulation affect voltage control and many of them don´t fulfil their obligations. (N-4)
- The conventional generation requested after the oscillations was not connected
- Generation loss in distribution: P < 1 MW and self-consumption of 435 MW before 12:32:57 (N-5)
- Inappropriate tripping of a generation transformer in Granada (N-6)
- Inappropriate tripping of solar thermal generation (Badajoz) and tripping of photovoltaic (Badajoz) without point-of-interconnection data from transmission network (N-7)
- Inappropriate tripping of a photovoltaic power plant connected also in the province of Badajoz but in a different transmission substation (N-8)
- Tripping of three wind farms (Segovia) without point-of-interconnection data from transmission network
- Tripping of one wind farm and a PV plant located at the province of Huelva, without point-of-interconnection data from transmission network
- Inappropriate tripping of photovoltaic power plant in Seville (N-9)
- Inappropriate tripping a PV generation located in the province of Cáceres (N-10)
- Tripping of PV generation connected to a 220 kV substation located in the province of Badajoz, without point-of-interconnection data from transmission network
- Tripping of one CCGT unit located at Valencia (N-11)
- Load shedding of pumping units and loads due to underfrequency results in increased system voltage
- The HVDC link operating in constant power mode continues exporting 1,000 MW to France
- Tripping of Nuclear Power Plant. (N-12)
This list misses out some transformer trips early in the scheme of events which I think should have been included.
More detailed assessment of the events of 28 April: Before the incident began
Earlier in the day, and indeed in the preceding days, there had been various frequency oscillations on both the Iberian grid and wider European power grids. These were described as normal, although I question whether they are desirable – it is normal to see cyclists running red lights in London, but it doesn’t make it any less illegal, or less of a threat to pedestrians. A wider lesson from this fiasco might be to consider whether these types of oscillations are signals of a weak grid and should not be tolerated the way that they are.
In any case, some of these fluctuations were attributed to solar ramping, which creates rapid changes in both the generation mix and the supply-demand profile. Much of the solar is connected to lower voltage networks rather than the transmission system, and therefore looks like a reduction in demand to the high voltage grid. This phenomenon led to frequency deviations and voltage oscillations.
As noted in part 1 – these are not causal in the sense that frequency oscillations do not cause voltage oscillations or vice versa, despite phrasing in the REE report suggesting otherwise. The REE report was either not written by a native English speaker or translated from a Spanish language equivalent and these statements are likely to be mis-translations. Both frequency and voltage oscillations can be caused by the same grid disturbance but they do not cause each other (the sun can cause plants to grow and people to get sunburn, but plants growing does not cause sunburn and vice versa!)
In any case, these fluctuations in frequency and voltage caused overall voltage levels at various points on the grid to drop. As the voltage recovered a couple of transformers tripped, which REE speculated was a result of their taps not being adapted quickly enough in response to the increase in voltage.
Transformer taps are a means of changing the voltage transformation by bypassing some of the windings. A tap changer is the mechanical (or solid-state) device that switches between taps. There are two types:
- Off-load tap changers which can only be changed when the transformer is de-energised
- On-load tap changers which can adjust the voltage while the transformer is operating, which is crucial for real-time voltage regulation
In transmission and distribution networks, on-load tap changers are essential for maintaining voltage within required limits, particularly as load and generation fluctuate throughout the day.
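To give a feel for why a slow tap changer can be caught out by a fast voltage recovery, here is a minimal sketch of how long an on-load tap changer might take to correct a voltage excursion. The 1.25% step size, 1% deadband and 5-second delay per step are illustrative assumptions of mine, not values from the REE report, and the function names are hypothetical.

```python
import math

def tap_steps_needed(v_pu: float, target_pu: float = 1.0,
                     step_pu: float = 0.0125, deadband_pu: float = 0.01) -> int:
    """Number of tap steps needed to bring voltage back inside the deadband.

    All default values are assumed for illustration only.
    """
    error = abs(v_pu - target_pu)
    if error <= deadband_pu:
        return 0
    # Each step shifts the voltage by step_pu; round() guards against
    # floating-point noise pushing an exact result up to the next integer.
    return math.ceil(round((error - deadband_pu) / step_pu, 9))

def correction_time_s(v_pu: float, delay_per_step_s: float = 5.0) -> float:
    """Time for the tap changer to finish stepping (assumed fixed delay per step)."""
    return tap_steps_needed(v_pu) * delay_per_step_s

# A 6% over-voltage needs four steps, i.e. 20 seconds under these assumptions.
print(tap_steps_needed(1.06), correction_time_s(1.06))
```

If voltage rises faster than the tap changer can step, the transformer is exposed to the excursion for the whole of that correction time, which is consistent with the kind of trip REE speculates about.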
At mid-day, the system was described as being compliant with normal operational procedures, notwithstanding the frequency oscillations of 0.2 Hz that had been identified. These were being damped at 20%.
REE says at this point there were no signs of the trouble to follow. I wonder whether we are ignoring underlying instability, as possibly evidenced by these low frequency oscillations, and fooling ourselves into thinking the grid is stable when fundamentally it only looks stable in a superficial way – statutory limits are being met but underneath the grid is actually quite weak.
Things start to go wrong just after mid-day
REE describes the genesis of the blackout as an atypical 0.6 Hz frequency oscillation that lasted 4 minutes and 42 seconds, which coincided with a reduction from 20% to 5% in the damping of the 0.2 Hz oscillation. The decay of a damped oscillation is exponential: with 20% damping the amplitude drops rapidly, by about 72% in each cycle, whereas with 5% damping it decays much more slowly, falling only around 27% per cycle.
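The per-cycle figures can be checked with a short calculation. This is a minimal sketch that assumes the oscillation behaves like a standard damped second-order mode, where the ratio of successive peaks is exp(-2πζ/√(1-ζ²)) for damping ratio ζ. The function name is my own.

```python
import math

def amplitude_ratio_per_cycle(zeta: float) -> float:
    """Ratio of successive peak amplitudes for a damped second-order mode."""
    return math.exp(-2 * math.pi * zeta / math.sqrt(1 - zeta**2))

for zeta in (0.20, 0.05):
    reduction = 1 - amplitude_ratio_per_cycle(zeta)
    print(f"damping ratio {zeta:.0%}: amplitude falls {reduction:.0%} per cycle")
```

Under that assumption the reductions come out at roughly 72% and 27% per cycle respectively, close to the figures quoted above.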
This new 0.6 Hz oscillation caused a drop in voltage and a voltage oscillation with an amplitude of up to 35 kV (the report says 30 kV, then cites a 375–410 kV range, which is clearly 35 kV). This oscillation is described as being caused by a solar PV plant in Badajoz. The 375–410 kV range refers to the fluctuating envelope of voltage during the oscillation. In other words, voltage was oscillating around a new (lower) midpoint, not just overshooting once and recovering. If the nominal voltage was 400 kV:
- The system began oscillating with a peak-to-peak amplitude of ~30–35 kV
- Meaning voltages temporarily swung between as low as 375 kV and as high as 410 kV
- That’s serious — most systems have tight voltage stability margins
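To put the envelope in context, the swing can be expressed relative to the nominal voltage. This is simple arithmetic on the 375–410 kV range, assuming (as above) a 400 kV nominal; the variable and function names are my own.

```python
NOMINAL_KV = 400.0  # assumed nominal voltage, as in the text above

def deviation_pct(v_kv: float, nominal_kv: float = NOMINAL_KV) -> float:
    """Deviation from nominal voltage, in per cent."""
    return (v_kv - nominal_kv) / nominal_kv * 100

low_kv, high_kv = 375.0, 410.0
peak_to_peak_kv = high_kv - low_kv       # 35 kV
midpoint_kv = (high_kv + low_kv) / 2     # 392.5 kV, below nominal

print(f"low:  {deviation_pct(low_kv):+.2f}% of nominal")
print(f"high: {deviation_pct(high_kv):+.2f}% of nominal")
print(f"peak-to-peak: {peak_to_peak_kv:.0f} kV around a midpoint of {midpoint_kv:.1f} kV")
```

The midpoint of 392.5 kV sits below nominal, which is the "new (lower) midpoint" behaviour described above: the dips are more than twice as deep as the peaks are high.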
These oscillations appeared most strongly on the interconnector corridor, which makes sense as it’s a long, low-damping HVDC route with significant power flow.
In response to this disturbance, REE enacted four pre-established measures:
- Coupling of 400 kV power lines to reduce system impedance
- Reduction of export capacity to France by 800 MW to 1,500 MW
- Setting the interconnector to France into constant power mode with a setpoint of 1,000 MW
- Disconnection of shunt reactors
Coupling lines reduces impedance, which raises the voltage stability margin. This action is almost always a sensible response, as it:
- Creates parallel paths for current, lowering the effective impedance
- Reduces voltage drop for a given current, improving voltage support
- Helps damp oscillations, since more paths distribute power flows more evenly
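The effect of coupling on impedance can be sketched with the standard parallel-impedance formula. The line impedance and current below are hypothetical round numbers chosen for illustration, not data from the Spanish network.

```python
def parallel_impedance(*z_ohm: complex) -> complex:
    """Equivalent impedance of lines connected in parallel."""
    return 1 / sum(1 / z for z in z_ohm)

line = complex(3.0, 30.0)   # hypothetical 400 kV line: 3 ohm resistance, 30 ohm reactance
current_ka = 1.0            # hypothetical current carried along the path

z_single = line
z_coupled = parallel_impedance(line, line)

# ohm x kA = kV, so this is the magnitude of the voltage drop along the path
print(f"single line:  |Z| = {abs(z_single):.1f} ohm, drop = {abs(z_single) * current_ka:.1f} kV")
print(f"two coupled:  |Z| = {abs(z_coupled):.1f} ohm, drop = {abs(z_coupled) * current_ka:.1f} kV")
```

Two identical lines in parallel halve the impedance, and with it the voltage drop for the same total current. Real lines are rarely identical, so the benefit in practice would be smaller than this idealised case.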
Reducing exports by 800 MW relieves stress on voltage. This seems counterintuitive at first because the interconnector in question is dc and should be immune to frequency/voltage interactions. However, the converter station on the Iberian side must draw real power from the ac grid and deliver it to the dc interconnector. If the grid is exporting 1,800 MW during a voltage sag, then as voltage drops, current must rise to maintain constant power export. This exacerbates the voltage drop and can overload nearby equipment, since the interconnector is trying to draw power out of a grid that is already sagging.
Therefore, reducing the export means less current is drawn from a weakened grid, relieving voltage stress. This also helps reduce power flow oscillations along long HV lines.
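The relationship between voltage, current and a fixed power export can be illustrated with the usual three-phase formula I = P / (√3 · V · pf). This is a rough sketch that assumes unity power factor at the converter's ac terminals, which is my simplification; the voltages are taken from the envelope discussed earlier.

```python
import math

def line_current_ka(p_mw: float, v_kv: float, power_factor: float = 1.0) -> float:
    """Three-phase line current (kA) needed to carry p_mw at line voltage v_kv."""
    return p_mw / (math.sqrt(3) * v_kv * power_factor)

for p_mw in (1800, 1000):
    for v_kv in (400, 375):
        print(f"{p_mw} MW at {v_kv} kV -> {line_current_ka(p_mw, v_kv):.2f} kA")
```

For a constant 1,800 MW export, a sag from 400 kV to 375 kV raises the current by a factor of 400/375, or about 6.7%. Cutting the export lowers the current at any given voltage, which is the relief described above.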
Switching to constant power mode at 1,000 MW might also appear to make things worse again. Before the voltage disturbance, the interconnector was running in voltage-dependent mode. When Iberian voltage dropped, export power fell slightly, helping to stabilise the grid. Switching to constant power mode at a lower setpoint fixes export at a manageable level, ensures France continues to receive power, avoiding disturbance propagation across the border, and keeps the converter operating within a known, controllable regime.
This can be an attractive option because once the load has been reduced and the voltage improved by other means, it is desirable for the interconnector to remain stable, avoid switching behaviour (voltage-following converters can respond unpredictably under low-inertia conditions), and not feed back instability into France. Flexibility is traded for stability.
As noted in my previous blog, shunt reactors absorb reactive power when voltage rises, helping to limit further voltage increases. Disengaging them means they stop limiting the rise in voltage, allowing it to increase more on the upward cycle, and therefore mitigates low voltage conditions. Impedance is lower when shunt reactors are disengaged.
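A shunt reactor behaves roughly as a fixed reactance, so the reactive power it absorbs scales with the square of the voltage. The sketch below assumes a hypothetical 150 Mvar reactor rated at 400 kV; the rating is purely illustrative and not taken from the REE report.

```python
def reactor_absorption_mvar(v_kv: float, rated_mvar: float = 150.0,
                            rated_kv: float = 400.0) -> float:
    """Reactive power absorbed by a fixed shunt reactor at voltage v_kv.

    Assumes a constant reactance, so Q scales with (V / V_rated) squared.
    The default rating is a hypothetical example.
    """
    return rated_mvar * (v_kv / rated_kv) ** 2

for v_kv in (375, 400, 410):
    print(f"{v_kv} kV -> {reactor_absorption_mvar(v_kv):.1f} Mvar absorbed")
```

Switching the reactor out removes this absorption altogether, which lifts voltage across the whole cycle. It is a one-off step change, though, not something that responds to the oscillation itself.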
While all of these actions seem sensible on the surface, they are static actions that were not adequate in what was a dynamic situation. These responses raise voltage in general, but do nothing to address the oscillation’s frequency content, timing mismatches or the phase angle instability across the network.
REE should have considered a more dynamic response, for example:
- dynamic voltage control via STATCOMs or fast synchronous condensers
- allowing interconnectors to respond dynamically (eg based on frequency or voltage magnitude and phase)
- using wide-area damping controllers to detect and suppress low-frequency oscillations
While REE’s actions were temporarily successful, the 0.6 Hz oscillation reappeared within a few minutes, with an accompanying voltage drop and voltage oscillations of 380-405 kV. As REE tried to dampen this recurrence of the 0.6 Hz oscillation, a new 0.2 Hz oscillation was detected. This was accompanied by voltage fluctuations of up to 28 kV at the Almaraz 400 kV substation.
At this point the 0.6 Hz frequency band was examined more closely and it was determined that there had been oscillations in this range since 10:30 that morning. Again, REE took the same static response measures (this time excluding the disconnection of shunt reactors). Transmission lines were coupled, exports to France were cut further and exports to Portugal were also reduced. As a result, voltage began to increase, although unevenly as these actions had different response times. At 12:22, REE began to connect shunt reactors to limit the voltage increase.
At the same time, there was an unexpected increase in demand which was later determined to be a reduction in generation connected to distribution networks. This was likely a response to negative intraday power prices making this generation uneconomic. This increase in transmission system demand led to a reduction of exports to France – which appears to be the main mechanism for balancing the system – decreasing the flow of power towards the interconnectors and causing an increase in voltage on the transmission system.
At this point, REE decided to call on some additional CCGTs to provide support to the network. Unfortunately the units were cold and required several hours to warm up and synchronise – they were not able to connect before the blackout started and ultimately were not used for that reason.
REE says that there were no over-voltages on the system at this point, but there were voltage oscillations.
Between 12:32:00 and 12:32:57 exports to France fell from 1,500 MW in a quasi-linear manner. This coincided with a number of events on the grid:
- IBRs reduced output in line with scheduled dispatch, leading to a reduction in reactive power actions since these units operate under power factor control. However, many IBRs failed to meet their power factor obligations under the grid codes
- More disconnection of distribution-connected generation causing transmission system demand to rise further, amplifying the previous increase in voltage
- Transmission lines consumed and released less reactive power, reducing impedance and also increasing voltage
- Some conventional generators failed to meet their reactive power obligations under the grid codes, which had a significant impact on voltage
At 12:32:57 there was a trip on the 220 kV side of 400/220 kV generation transformer in Granada. A 400/220 kV generation transformer is a step-down transformer used to connect a power station that generates electricity at 400 kV to a lower-voltage 220 kV network. If a trip occurs on the 220 kV side it means a protection system detected a fault or abnormal condition on that side and automatically disconnected the transformer or circuit at the 220 kV terminals. This could have been an internal transformer fault, a fault in downstream 220 kV equipment (lines, breakers, or substations), or a protective system error (eg an overly sensitive relay or coordination fault).
If the 220 kV side is part of the generator’s plant this is likely a generator-side issue (eg fault in auxiliary transformers, local switchgear, or poor protection design). In this case, it may point to inadequate fault ride-through capability, or bad coordination with grid events. However, if the 220 kV side is part of the grid it’s a grid-side fault or protection event – the generator may have been exporting power, but the grid-side equipment or protections caused the disconnection. This would fall under the transmission system operator’s responsibility, or a distribution company if relevant. In practice, Spain’s grid is meshed and centrally operated, so REE likely owns and maintains the 220 kV side.
A trip on 220 kV side means the generator’s ability to export power is reduced or cut off. If the transformer was the only export path, the generator would be disconnected from the system even if the unit itself was operating normally. This might also increase stress on the 400 kV system, especially during oscillations, which is why the report highlights it.
This was a collector substation which is a type of substation used primarily in renewable energy installations to aggregate (or “collect”) electricity generated by multiple smaller generating units before sending it to the transmission or distribution network, eg lots of wind turbines or solar panels. The individual generation units are connected to a local internal network that feeds into a collector substation which then steps up the voltage (eg from 33 kV or 66 kV to 132 kV or 220 kV) and exports the combined power to the grid via a transmission or distribution substation.
REE speculates that the transformer tripped due to a faulty tap setting – as the system recovered from previously low voltages, the tap changers may not have responded quickly enough to the increasing voltage. Given the voltage levels at the time, the disconnection of the transformer meant it was not compliant with the grid codes.
19 seconds later, 727 MW of solar generation in the Badajoz region tripped off, reducing both active and reactive power. The grid was still operating within its normal operating parameters, so these units failed to meet their fault ride-through obligations. Within a second, three wind farms tripped in Segovia. Within the following seconds, more wind and solar tripped, amounting to 834 MW in a 650 ms window. These units were in Badajoz, Huelva, Seville and Cáceres. Analysis of the RoCoF (rate of change of frequency) data suggests that 1,150 MW might have disconnected during this time.
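For readers wondering how RoCoF data can be used to estimate lost generation, the usual starting point is the swing equation, which links the power imbalance to the rate of change of frequency and the kinetic energy stored in the system: ΔP ≈ 2 · E_k · RoCoF / f0. The sketch below is a simplified illustration only. The kinetic energy and RoCoF values are hypothetical numbers of my own, not figures from the REE report, and I do not know the exact method REE used.

```python
def lost_generation_mw(kinetic_energy_mws: float, rocof_hz_per_s: float,
                       f0_hz: float = 50.0) -> float:
    """Estimate generation lost (MW) from an observed RoCoF via the swing equation.

    kinetic_energy_mws is the system's stored kinetic energy (H x S, in MW.s).
    A negative RoCoF (falling frequency) gives a positive loss.
    Ignores load response and any frequency-dependent effects.
    """
    return -2 * kinetic_energy_mws * rocof_hz_per_s / f0_hz

# Hypothetical example: 50 GW.s of kinetic energy, frequency falling at 0.5 Hz/s
print(lost_generation_mw(50_000, -0.5))  # 1000.0
```

The same loss of generation produces a steeper RoCoF when less kinetic energy is on the system, which is one reason the estimate depends heavily on knowing what synchronous plant was running at the time.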
These units were primarily located in the south of Spain, leading to a reduction in south-to-north flows, and causing the interconnector with France to change direction and begin to supply northern Spain.
By this time more than 2 GW of renewable generation had been lost from the grid across the various trips and unexpected price-based disconnections. The impact of the reduced reactive power control from these assets was exacerbated since a couple of conventional generators in the southern and central regions of Spain failed to meet their reactive power obligations under the grid codes.
In addition to the voltage impact of these disconnections, grid frequency fell. With each new disconnection, voltage increased and frequency decreased.
At 12:33:19 maximum imports from France were reached (3,807 MW) with 4,609 through the ac links. This statement is very confusing since on the REE website it says the Spain-France capacity is 2600/2700 MW (there is a different import vs export capacity). While the Baixas–Santa Llogaia HVDC link is commonly cited as having a 2,000 MW capacity, REE’s own data suggests that over 3,800 MW was flowing via “the HVDC network” at the time of the blackout.
This implies either a higher technical limit than advertised, or the presence of additional HVDC capacity not clearly documented in public sources (possibly Baixas-Vich). The ambiguity in published interconnector data — even from official sources — highlights how opaque Europe’s physical grid topology can be, despite its centrality to energy security. I have written to REE requesting clarification and will update this blog if I receive a satisfactory response.
In any case, the drop in grid frequency in Spain caused a lack of synchronisation with the French grid.
At this point, there are both load disconnections and pumped hydro trips. IBRs continue to trip, with each new generator trip causing frequency to fall further. As frequency falls below 49.5 Hz, CCGTs begin to trip – it is notable that conventional generation (particularly gas units) did not trip until frequency fell below the prescribed limits, ie unlike many renewable generators, they did not violate their fault ride-through obligations.
With frequency falling below 49.5 Hz, 2,000 MW of pumped hydro trips. When it falls to 48.3 Hz a further 588 MW of pumped hydro disconnects and load shedding of industrial consumers continues down to 49 Hz by which time 1,402 MW of load is lost. This disconnection of demand causes voltage to increase.
The interconnector with Morocco tripped at this point.
Then there are some further confusing comments about interconnection with France: “When the frequency reaches 48.46 Hz, the ac interconnection lines with France trip preventing expanding the incident further into France and to facilitate its availability for the restoration. The HVDC link, which was in constant power mode, does not disconnect and continues exporting 1.000 MW to France.”
Yet on the previous page we were told that the interconnectors were importing to Spain from France to compensate for the loss of generation in the south of Spain. And we were told the ac lines tripped (loss of synchronisation). If the ac lines had already tripped due to loss of synchronisation, how could they be said to trip again moments later? And how was the HVDC line suddenly exporting when the Spanish grid was so short the frequency was falling?
This may suggest that the constant power mode enacted on the HVDC in response to the initial 0.6 Hz oscillation over-rode the import mode and pushed the interconnector back into exports. If this was the case it was a very unhelpful setting that increased the stress on the Spanish grid.
Even more confusingly, after telling us the HVDC link did not disconnect, the report tells us that “from this moment onwards, the Spanish and Portuguese systems are isolated”.
In any case, at this time the frequency fell to 47.79 Hz, causing a nuclear reactor and then several CCGTs to trip, and then we are told the HVDC line to France tripped, so at this point Iberia truly was islanded. At 12:33:24 the Spanish grid collapsed and at 12:33:27 voltage fell below 1 kV and there was a total system blackout.
A solar PV fault ultimately brought down the Iberian power grid
A solar inverter was the initial cause of the fault but poor grid code compliance and poor decision-making by REE made this a fatal fault. To summarise, the main events were:
- A PV inverter caused some frequency oscillations and a drop in voltage including voltage oscillations
- Two substations in Zaragoza tripped due to transformer taps not adapting fast enough to the increase in voltage after the system responded to the drop caused by the PV inverter oscillation. These substations, at the 55 kV level, do not appear to be owned by REE
- Renewables generators fail to respond to power factor requirements and conventional generators fail to respond to reactive power requirements set out in the grid codes
- REE fails to schedule enough thermal units for reactive power and initially responds to voltage issues with static rather than dynamic means (as we discussed yesterday with the car up the hill analogy)
- Lots of wind and solar (PV and thermal) trips inappropriately, ie failing to meet fault ride-through obligations. No conventional generation trips until normal operating conditions are breached
- Voltage falls outside tolerances for conventional generation and frequency falls below 49.5 Hz causing these generators to start to trip further reducing frequency
- Cascading failure leads to full voltage collapse and blackout
As with the 2019 blackout in Great Britain, this event was characterised by poor compliance with grid codes, although to a much more severe extent. Non-compliance appears to be widespread, particularly for renewable generators, many of which failed to meet both their power factor obligations and their fault ride-through obligations – the latter being more serious. Conventional generators failed to meet reactive power requirements, and even REE seems to have been operating non-compliant transformers.
In addition, REE made several serious errors (aside from not monitoring grid code compliance). It failed to schedule enough reactive power provision at the day-ahead stage. It relied too heavily on static voltage response in a dynamic situation (possibly because it did not understand what was happening in real time) and it failed to react fast enough when the system began to move outside normal parameters.
It is somewhat disappointing that REE does not address code compliance in its recommendations. The system operator is pushing for new additions to the Operating Procedure (the Spanish grid code) but does not address the question of compliance – these new obligations could equally go unmet by non-compliant generators.
Importance of enhanced voltage control in low-inertia grids
While the REE report states that the 28 April blackout “was not due to an inertia issue,” this shouldn’t be taken as exoneration of low-inertia system conditions. Rather, it exposes a more subtle and arguably more serious, systemic weakness – the erosion of voltage stability and dynamic controllability in high-IBR, low-synchronous environments.
In traditional grids, inertia provides a stabilising buffer against rapid frequency changes by passively resisting acceleration or deceleration of the grid’s rotational mass. But inertia alone doesn’t stabilise voltage – this is done through the ability of synchronous generators to supply reactive power, fault current, and voltage stiffness, often without needing explicit control actions.
As conventional generators are displaced by IBRs which lack rotating mass and don’t produce reactive power inherently, voltage becomes a weak point. While frequency tends to dominate public and regulatory discussions, voltage instability is often the faster and more dangerous mode of failure, and it played a key role in Spain’s blackout.
Critically, inverter-based resources disconnected before the frequency fell to critical levels, driven by voltage instability and local power quality. That in turn triggered a cascade – as solar and wind generation dropped off, frequency was pulled down and voltage also fell. The frequency reductions rather than voltage then caused conventional generators to trip as well, and the system spiralled into a full blackout.
This shows that:
- Voltage support is no longer a secondary consideration in high-IBR systems, it is essential to stability
- Voltage and frequency are coupled, but voltage excursions can occur first, and the loss of voltage support can lead to frequency collapse, not the other way around
- IBRs are not just passive victims of instability, they can actively contribute to it and even cause it if their controls are not properly tuned to the system they’re operating in
The Spanish blackout should be a wake-up call for system operators in any region with growing shares of non-synchronous generation, because it shows that low inertia magnifies other vulnerabilities. Even if low inertia isn’t the direct cause of a blackout, it reduces system stiffness and weakens voltage control, making the grid more susceptible to disturbances.
Stable frequency alone does not guarantee grid resilience – without proper voltage support and fault ride-through capability, especially from IBRs, a power system can appear healthy on one axis while becoming dangerously fragile on another. This is important when considering whether inertia limits can be lowered. And it is essential if considering wider use of inverter-based batteries for inertia support to ensure that they are fully grid code compliant. The compliance issues identified in GB in 2019 and now in Spain are a huge red flag that regulators should clamp down on.
While we talk about “inertia” as if it’s a single quantifiable value (usually in GVA·s), the term is often a proxy for a much wider set of stabilising properties that synchronous machines provide naturally, because of their physical and electrical characteristics. IBRs, particularly batteries, can be very good at providing specific services such as fast frequency response, but they lack the default, location-sensitive, and passive characteristics that synchronous machines provide.
The REE report says the blackout wasn’t caused by “inertia”. Technically, it wasn’t caused by a low frequency event, but the system lacked synchronous damping, so the initial voltage instability was not naturally absorbed. It lacked voltage stiffness, so inverter mis-behaviour propagated rapidly, and it lacked a robust synchronous reference, which exacerbated control loop instability. So while “inertia” as a number might not have been the issue, the absence of synchronous behaviour absolutely was.
And it is no coincidence that when the fault propagated into high inertia France it was quickly contained and power to the area was rapidly restored, unlike in low inertia Spain which suffered a full collapse. This could be paraphrased as high-synchronous-generation France vs the high IBR grid in Spain.
Beware of the normalisation of deviance
The other key conclusion is that, just because a grid is operating within its statutory or grid code parameters, this does not mean it is stable, and to suggest it is could be seen as pretty complacent – in power system operation, “within limits” is not the same as “under control”. Grid code thresholds and statutory frequency or voltage bands are designed to define the outer boundaries of acceptable behaviour, not to guarantee stability – a system that is technically compliant but exhibits persistent oscillations, erratic inverter response, or poor damping, may be inherently unstable as Spain demonstrated on 28 April.
This is essentially the normalisation of deviance, a concept borrowed from engineering risk analysis and famously used in discussions about the Challenger disaster. It describes how the absence of immediate failure becomes evidence that the system must be safe, even if warning signs are accumulating. In her analysis of the Challenger disaster, sociologist Diane Vaughan used it to explain the repeated choice of NASA officials to fly the space shuttle despite a dangerous design flaw with the O-rings. Vaughan describes the phenomenon as occurring when people within an organisation become so insensitive to deviant practice that it no longer feels wrong. Insensitivity occurs insidiously and sometimes over years because disaster does not happen until other critical factors line up.
In this context, describing the 0.2 Hz oscillations as normal, is a literal example of the normalisation of deviance. In Spain, oscillations were present and had been for some time, damping was poor, and IBRs were failing to respond correctly, yet the system was declared “within parameters”. Sustained or growing oscillations indicate that the system is injecting energy into disturbances, and not absorbing or damping them. If damping rations are low then disturbances take longer to settle giving them more time to interact with protection systems, destabilise inverter controls, or trigger cascading responses.
So while the amplitude of the 0.2 Hz frequency swings was small, their persistence and poor damping were red flags.
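The link between damping and settling time can be made concrete with a small sketch. It uses the standard logarithmic-decrement relation for an idealised single-mode, second-order oscillation, which is a simplifying assumption: real inter-area modes are not this clean. The function names and example amplitudes are illustrative and are not taken from the REE report.

```python
import math


def damping_ratio_from_peaks(first_peak, later_peak, cycles_apart=1):
    """Estimate the damping ratio from two same-sign peaks of an oscillation.

    A positive result means the oscillation is decaying, zero means it is
    sustained, and a negative result means it is growing.
    """
    if first_peak <= 0 or later_peak <= 0:
        raise ValueError("peak amplitudes must be positive")
    if cycles_apart < 1:
        raise ValueError("cycles_apart must be at least 1")
    # Logarithmic decrement: natural log of the amplitude ratio, per cycle.
    delta = math.log(first_peak / later_peak) / cycles_apart
    # Standard relation between logarithmic decrement and damping ratio.
    return delta / math.sqrt(4 * math.pi ** 2 + delta ** 2)


def cycles_to_settle(zeta, fraction=0.05):
    """Cycles needed for the envelope to decay to `fraction` of its start value.

    Returns infinity for zero or negative damping (the oscillation never settles).
    Only meaningful for an underdamped mode, i.e. zeta below 1.
    """
    if zeta <= 0:
        return math.inf
    if zeta >= 1:
        raise ValueError("zeta must be below 1 for an oscillatory mode")
    # Amplitude falls by exp(-delta) every cycle of the damped oscillation.
    delta = 2 * math.pi * zeta / math.sqrt(1 - zeta ** 2)
    return math.log(1 / fraction) / delta


if __name__ == "__main__":
    # Illustrative amplitudes only (arbitrary units).
    zeta = damping_ratio_from_peaks(1.0, 0.9)
    print(f"estimated damping ratio: {zeta:.4f}")
    print(f"cycles to settle to 5%: {cycles_to_settle(zeta):.1f}")
```

As a rough illustration, a mode with a damping ratio of 0.05 takes between 9 and 10 cycles to fall to 5% of its initial amplitude, which at 0.2 Hz (5 seconds per cycle) is roughly 45 to 50 seconds of ringing. A negative result from `damping_ratio_from_peaks` corresponds to a growing oscillation, which never settles on its own.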
A rush to net zero and a complacent attitude to voltage control and code compliance led Spain’s grid to collapse
The key messages we should take away from the Iberian blackout are:
- Poorly configured inverters can cause catastrophic failures in weak grids
- More attention needs to be paid to voltage control and not just managing frequency
- There needs to be better monitoring of grid code compliance
- The normalisation of deviance often leads to disaster
Fundamentally, TSOs, regulators and energy ministries need to ensure they are not so blinded by their net zero goals that they compromise grid stability, particularly by allowing a gradual erosion of standards that eventually exceeds what grids can cope with. Eleven people lost their lives in the Iberian blackout, so it is vital that the right lessons are learned to avoid any repeat either in Spain or elsewhere.
Thanks for all your work on this Kathryn.
> As noted in part 1 – these are not causal in the sense that frequency oscillations do not cause voltage oscillations or vice versa, despite phrasing in the REE report suggesting otherwise. The REE report was either not written by a native English speaker or translated from a Spanish language equivalent and these statements are likely to be mis-translations. Both frequency and voltage oscillations can be caused by the same grid disturbance but they do not cause each other (the sun can cause plants to grow and people to get sunburn, but plants growing does not cause sunburn and vice versa!)
This is not quite true – voltage and frequency (at the generator) ARE linked, in that the voltage comes from the physical speed of the magnets moving past each other. Every motor/generator has an EMF constant, which defines the voltage produced per RPM (or rad/s). This is governed by the strength of the rotating field – which in a power station is not really a constant, because the rotor field strength is set by the exciter – and it is adjusted by a control system to keep the voltage output stable at all times. However, there is a trade-off in that the EMF constant also defines the torque constant: if you increase the field strength in the rotor (to produce a higher voltage for the same RPM) then the torque produced per ampere will also increase, which causes a strain on the turbine, extracting more power from its spinning inertia, i.e. slowing it, which causes a further reduction in both frequency and voltage.
My short take: emergent behaviour, in a complex and fast-changing electrical ecosystem.
“Within parameters under control” – just gold.
Congratulations on an amazingly detailed analysis that deserves several re-reads. Perhaps that last paragraph could be used as an “executive summary” for the political decision-makers who brought this unhappy state of affairs into being.
A terrific – and somewhat terrifying – piece of analysis, Kathryn. I particularly liked within limits is not the same as under control.
An excellent article. I think that both NESO and DESNZ should give us an assurance that a similar event will never happen here.
The event highlights that the electricity system is a complex dynamic machine not a passive trading system. In fact trading made the Spanish incident worse.
Many struggle with the simple concept of supply equalling demand at all times let alone grasping these complex technical issues.
Voltage and frequency oscillations are not historically normal; they are a recent and developing consequence of weakening grids.
Protocols for operation of the grid by the book are insufficient. A deep technical understanding (backed by simulator training) by the grid operators is required to respond to more unpredictable circumstances. Actions can have complex unwanted results.
There is a complex interaction between frequency and voltage.
Frequency is controlled by the balance between customer demand and the supply of active power generation nationally.
But voltage is controlled by the balance of reactive power generation, customer requirements and requirements of the grid components themselves locally. The grid components can both absorb and generate reactive power. This response varies greatly with power flow through the component and the frequency. The grid requirement dominates the customer requirement.
Coupling lines is not always an appropriate action. Although strengthening the grid from an inertia (frequency oscillation) point of view it may well dramatically increase system voltage.
The Spanish system was totally out of control and was not able to cope with any credible fault. Any actions taken would have both beneficial and detrimental effects.
The only action open was to increase the dynamic reactive power reserves. However, in the hours leading up to the incident an “experiment” to show the progress towards decarbonisation was conducted. Gas-fired and hydro plant (and nuclear) reactive reserves were taken off the system. It is interesting to note that since the incident all the nuclear has been restored and at least four times as much gas plant is running.
“I think that both NESO and DESNZ should give us an assurance that a similar event will never happen here.”
I’m not sure that’s possible when you don’t have a single entity in control, the commercial trading decisions in this incident illustrate the problem quite clearly.
Well written article – Electrical Grid managers world wide should read this report and adjust their systems for continued output and safety.
Thank you for your excellent explanation. While I didn’t understand it in depth I definitely got the gist of it. As a chemical engineer I’ve been in numerous incident investigations over my career. One thing that is obvious to me is that the root cause(s) were not explicitly identified. There is no or little accountability determined.
What is the cause of noncompliance with code? If they were compliant the incident may not have occurred. But why weren’t they compliant? Less than adequate auditing and enforcement? Caused by a culture that allows non compliance? Caused by less than adequate management? There are other paths that could be followed but this is the idea.
Another root cause could be inadequate integration of renewable energy sources into the grid. Caused by the need to meet government requirements. Caused by the need to save the planet. Caused by blindly adopting net zero requirements…. etc.
These two root causes (and there are others) point to issues in upper management and government. Therefore there’s little expectation that they will be remedied although somebody may get fired as a scapegoat. This incident could bring to the surface all kinds of actions that could prevent recurrence but when potential causes are eliminated hours after the incident (inadequate integration of renewables) it’s very unlikely there will be fundamental changes.
The most amazing thing about the 0.6Hz oscillation just after midday is that it was supposed to have originated at a solar farm producing 250MW (Planta fotovoltaica A a.k.a. Iberdrola’s Plata Solar Nuñez de Balboa), yet the oscillations in power exports to France were up to 900MW: something(s) was acting as an amplifier. If they had identified the source you might think they would have required it to shut down or limit operation severely. Yet instead it was allowed to increase output to 350MW at 12:15, just before the 0.6Hz oscillations re-emerged. They were cutting exports, but there is no mention of what generation was being cut to enable that.
I think the Granada substation transformer was gathering renewables generation at 220kV for delivery to the grid at 400kV (i.e. step up). As I understand it the gathering systems are in co-operative ownership among the contributing renewables generators, but the responsibility for the transformer is unclear: REE imply that the low voltage side was not theirs to manage.
Something I think the report omits to mention is that most of the pumped storage is in the mountainous North, so there was almost 3GW coming up from the South just to power it, not just the export flows to France. They needed a much better understanding of how disconnection would affect the grid.
I presume we will be hearing more from ENTSO-E in due course, including how they view the 0.2Hz oscillations which have been a grid feature for over a decade. We have this (ironically published at almost the same time):
https://www.entsoe.eu/news/2025/07/16/28-april-blackout-in-spain-and-portugal-expert-panel-releases-new-information/
With this summary and timetable for further expert meetings:
https://www.entsoe.eu/publications/blackout/28-april-2025-iberian-blackout/
They are placing emphasis on issues of voltage control, which they describe as novel. I’m surprised (well, perhaps not) that they are ignoring the power oscillation issue to a large degree.
This event exposes a fundamental issue which is that we now have power grids where the architecture is defined more by the markets than by engineers.
If, for the grid to be stable, generation needs to come with a certain amount of embedded inertia then surely that is what should be being bought? Similarly, if generation needs to exhibit certain characteristics with respect to voltage control / reactive power then again, that is what should be bought.
It is not difficult to make renewable generation mimic rotating machines, adding energy storage in the form of batteries or supercaps is not difficult nor any more expensive than buying it separately.
What has been allowed to happen is that generators delivering only a portion of the functionality previously expected from generators have been allowed to connect to the grid and receive the same remuneration as those generators who do provide the full package.
There has been too much focus on energy delivered, not on control.
“relied on static controls and failed to deploy dynamic response assets” is quite a red flag to start with.
Some options for more dynamic response were listed, but I wonder if these are available at all or in sufficient quantity or in the right places. No use in having something that could help if it’s on the wrong side of something that’s just tripped and disconnected it. Plus even if they are available have they been sized for a historic high inertia grid and not updated as the mix has changed?
I get the overall impression of a large complex system and a level of complacency, some of which could be due to behaviours of a historically high inertia system where certain things were taken for granted/not understood/improperly accounted for. With fewer spinning machines and more IBRs the overall behaviour may be surprisingly different from historic understanding. Using historic understanding to manage an event – manually – when that understanding no longer applies…
I’m also suspicious of IBRs not meeting expectations and wonder how much of that is due to the control system being a piece of software written using an ‘agile’ methodology (i.e. never finished and constantly being iterated) instead of being governed by the physics behind the spinning machine. This is especially relevant when the software engineer writing the code likely doesn’t have appropriate electrical knowledge/background and is even less likely to have grid scale experience, and that’s before you even consider that it was probably outsourced to the lowest cost bidder. The UK thing a few years back where the trains tripped due to frequency(?) and then were not restartable without an engineer call-out comes to mind – theory vs practical application in a real world complex situation.
In general though, meeting expectations of the grid code may be hard to confirm, as any particular generator’s ability may be dependent in some ways on the conditions on the grid it’s connected to. You can model all sorts of scenarios; it’s the ones you haven’t thought about or have deemed unlikely that should be the worry.
The misunderstanding of the risks and normalisation of deviance is an interesting one. We’re generally not very good at understanding risks and probabilities, and when commercial decisions take precedence they all tend to get lumped under unjustifiable costs, which in turn conditions us further along that path as the risk isn’t taken seriously until after the fact.
One characteristic of Emergent Behaviour, which arises when a sufficient number of apparently innocuous components interact in a dynamic ecosystem, is that such behaviour cannot be predicted from even deep analysis of any one component.
Thus ‘compliance’, ‘regulation’ and suchlike, which all focus on individual components in the expectation that having them ‘compliant’ rules out surprises, is addressing the wrong issue. It might be necessary but is not sufficient.
Just as the behaviour of an ant farm can’t be predicted by dissecting ants, nor can a large grid’s behaviour be predicted by specifying IBR characteristics.
‘Within parameters’ does not imply ‘under control’…..
Following on from these blogs on the Iberian blackout, here is a video aimed at a more lay audience. I still attempt to explain some of the physics (I can’t help myself!) and hope people find it interesting…
https://www.youtube.com/watch?v=dVJjlwMnz-Y
Every professional in electrical engineering knew immediately how this happened. Hawaii is a good example of a solar climate: their grid became unstable as more and more solar was installed. They have to keep their fossil fuel electrical generators running at idle, ready to pick up the load whenever a cloud comes over or a similar event occurs. Those with solar are curtailed to some extent and also pay a fee related to their free use of the grid, for any excess, while they cash in their solar credits when the sun doesn’t shine. Spain has to do something similar or it will happen again, unless regulators force them to abide by methods designed to avoid what happened. Where were all the professionals when Spain blacked out, telling everyone what happened? This post is what happened!! Because it’s more about politics and money and getting the grants, credits etc. and unethical professionals who say nothing.
An outstanding report, of the kind we have come to expect from Kathryn. Its implications should be far reaching, certainly for the GB grid system that is no longer sustainable, given intent.
For the Iberian incident some attention is needed to promote the initial cause being traced to a faulty inverter at Badajoz substation, as there must be thousands of other inverters fitted to IBR generation locations throughout their system. A critical factor is that there was no acceptance beforehand of the precarious condition of the Spanish grid. It had been sufficient to tolerate such deviations so long as operational standards were being maintained. With any major grid fault there are always unwelcome surprises with sympathetic trips, best catered for by having an excess of inertia support. The attempt to explore GVA boundaries is futile as, over time, system parameters continually change.
With hindsight it is unsurprising to reveal compliance failures with ageing protection although not on the scale indicated by the report. Nothing can be absolute in a technical sense with any equipment. The transmission grid is a dynamic beast, unlike the previous static distribution sector that has now been converted to extend its nature to become dynamic with the scale of IBR generation, a notable factor of the blackout being the rapid rate of loading dispatch.
As the report indicated the key requirement is to maintain a high level of inertia to avoid any repeat of this incident. This raises the question as to what level. The present GVA level is essentially arbitrary and it seems to be without any real time indication for operational control. Clearly some compromise with existing renewable development is forced to be made but it is essential to prevent further encroachment of any RCW generation. Inevitably other grid systems are vulnerable.
The position of the UK is not as advanced as Iberia though similar. I understand load changes are from CCGT plant although would question the increasing use of renewables given the scale of constraint payments underway, smart meter adjustments and certainly with rolling power cuts. There is the conflict between cost and security to consider. Other factors are interconnection variability and its continued access.
The greatest danger will arise from political decision to treat the symptoms and not the cause in order to avoid accountability for discredited renewable policy under Net Zero. Present decision making is entirely unsuitable as has been obvious for well over a decade. The repercussions of any system collapse when combined with having the highest energy bills in the developed world and a shrinking manufacturing sector would be terminal. To any engineer the Watt-Logic report is clear and decisive, I fear under existing arrangements its further implementation will become a betrayal.
Thank you for this detailed examination. There is a larger issue that has not been discussed here. The merit order for Spanish generation resources was changed by government decree. A unique Spanish tax on nuclear power played a very important role in degrading the frequency performance of the Spanish electricity grid. The PSOE is the head of a Spanish coalition government with a narrow ruling margin. This PSOE tax (impuesto) was the reason why a pair of nuclear reactors were off-line on the morning of 28 April 2025.
The Spanish nuclear tax was a method to obtain additional subsidies for solar and wind generation, both of which produce low-quality, intermittent electric power. Those taxes are substantial, now accounting for about half of the cost of some Spanish nuclear power plants. The taxes are on the production of safe, high-quality, reliable, abundant, and typically cost-effective nuclear power. The Spanish nuclear tax made nuclear power relatively expensive, altering the loading order based on the principle of lowest-cost dispatch by Red Electrica.
For additional details, please see this 07 July 2025 GreenNUKE Substack article, “The Spanish Version of the ‘Duck Curve’ is a real killer – This curve underscores the problem of insufficient synchronous grid inertia in Spain on April 28, 2025.” Please search for it by the first few words of the title.
Just in case this information is not known it was issued 19 October 2023 with a view to becoming mandatory on 6th November 2026. The crucial part of the FERC order is the Performance requirements, starting on page 178.
I suspect this document is behind why the US has ditched renewables.
E-1-RM22-12-000 | Federal Energy Regulatory Commission
The conclusion is that the protection relays worked as they were supposed to, protecting the lines, transformers and the rest of the equipment. Reactive power compensation was not met by the power electronics. There was no frequency response. All of this is because turbines and generators correct for the grid inertia. There is a limit to the number of renewables you can put in line with traditional sources for them to work properly. This limit is <30%. I know what they are going to do right now. They are going to recalibrate the protection relays to delay the tripping.
The remedy will be worse than the illness.