PowerShell: -split With \G Stops After First Split

Created on 17 Nov 2020 · 5 Comments · Source: PowerShell/PowerShell

Steps to reproduce

"ABCDEFGH" -split '(?<=\G..)(?=..)'

Expected behavior

In PS 7.0.3 and earlier, this is the output:

AB
CD
EF
GH

Actual behavior

In PS 7.1.0, this is the output:

AB
CDEFGH

Workaround

This works in PS 7.1.0:

[regex]::Matches("ABCDEFGH", "..") | % Value
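
Note that the ".." pattern returns complete pairs only, so a trailing odd character would be dropped. Below is a minimal sketch of a more general variant, assuming a shorter final chunk should be kept. The function name Split-Chunk and its parameters are hypothetical, introduced here for illustration; they are not part of PowerShell.

```powershell
# Hypothetical helper: split a string into chunks of at most $Size characters.
function Split-Chunk {
    param(
        [Parameter(Mandatory)]
        [string] $Text,

        [ValidateRange(1, [int]::MaxValue)]
        [int] $Size = 2
    )

    # (?s) lets '.' match newlines as well; {1,N} is greedy, so every chunk
    # is full-sized except possibly the last one.
    [regex]::Matches($Text, "(?s).{1,$Size}") | ForEach-Object Value
}

Split-Chunk -Text 'ABCDEFGHI'            # AB, CD, EF, GH, I
Split-Chunk -Text 'ABCDEFGHI' -Size 4    # ABCD, EFGH, I
```

This avoids \G entirely, so it should not depend on the behaviour reported in this issue.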

Environment data

Name                           Value
----                           -----
PSVersion                      7.1.0
PSEdition                      Core
GitCommitId                    7.1.0
OS                             Microsoft Windows 10.0.19042
Platform                       Win32NT
PSCompatibleVersions           {1.0, 2.0, 3.0, 4.0…}
PSRemotingProtocolVersion      2.3
SerializationVersion           1.1.0.1
WSManStackVersion              3.0

Labels: Issue-Question, Resolution-External, Waiting - DotNetCore

All 5 comments

Removing \G from the pattern restores multiple splits, which suggests that the \G anchor indeed is the source of the unexpected behaviour.

"ABCDEFGH" -split '(?<=..)(?=..)'
#                      ^
#                    No \G

NOTE: Below is not the desired output; it is shown here only to illustrate the behaviour without \G.

AB
C
D
E
F
GH

@sharpjs Thanks for your report!

This broke in PowerShell 7.1 Preview 1, after we moved to .NET 5.0. PowerShell uses the .NET Regex class, so this is a .NET issue.

I see you have experience with .NET. Please create a simple C# demo, open a new issue in the .NET Runtime repository, and reference this issue for tracking.
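
Before writing the C# demo, a quick check from PowerShell itself can hint at whether the operator or the regex engine is responsible. This is only a sketch: it assumes that calling [regex]::Split directly exercises the same .NET regex behaviour that -split relies on, and the variable names are illustrative.

```powershell
$text    = 'ABCDEFGH'
$pattern = '(?<=\G..)(?=..)'

# Split once with the PowerShell operator and once with the .NET API directly.
$viaOperator = @($text -split $pattern)
$viaDotNet   = @([regex]::Split($text, $pattern))

# Show which .NET runtime this session is running on.
[System.Runtime.InteropServices.RuntimeInformation]::FrameworkDescription

"-split operator : $($viaOperator -join ' | ')"
"[regex]::Split  : $($viaDotNet -join ' | ')"
```

If both lines print the same result on an affected version, the behaviour likely originates in .NET rather than in the -split operator.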

Just to offer a pragmatic workaround in the meantime: "ABCDEFGH" -split '(..)' -ne ''
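
How this workaround operates, as a short sketch (variable names are illustrative): because the pattern contains a capture group, -split keeps each matched pair in its output, and the empty strings left between adjacent pairs are then filtered out by -ne ''.

```powershell
# Step 1: split on every pair; the capture group keeps the pairs in the result.
$raw = 'ABCDEFGH' -split '(..)'
$raw.Count            # 9: an empty string before, between, and after the four pairs

# Step 2: with an array on the left-hand side, -ne acts as a filter.
$pairs = $raw -ne ''
$pairs                # AB, CD, EF, GH

# Unlike [regex]::Matches with '..', a trailing odd character is kept.
'ABCDEFGHI' -split '(..)' -ne ''    # AB, CD, EF, GH, I
```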

Fixes in .NET runtime repo:
6.0.0: https://github.com/dotnet/runtime/pull/44975 (merged)
5.0.x: https://github.com/dotnet/runtime/pull/44985 (WIP as of this writing)

@sharpjs Thanks!
