PowerShell RegEx to Parse EDI Files

NOTE: I have a newer blog post that uses what is demonstrated here to format and rename all EDI files in a disk directory.

The one thing I didn’t account for (yet) is the case where different files use different EDI delimiters. Technically, you should read the delimiters from the ISA segment (the field delimiter is the character right after “ISA”, and the segment delimiter is the character that ends the ISA segment) and use those in the RegEx match. For now, I’m assuming the field delimiter is * and the segment delimiter is the tilde (~).
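Here is a minimal sketch of how the delimiter lookup could work. The function name Get-EdiDelimiters and the variable $sample are my own illustrations, not part of the scripts below. In a strictly fixed-width ISA segment the delimiters sit at fixed positions (4, 105 and 106), but the sanitized sample in this post is not padded to full width, so the sketch counts the 16 field delimiters that precede ISA16 instead of relying on fixed offsets.

<pre>
# Hypothetical helper (not from the original scripts): read the delimiters
# from the ISA segment instead of hard-coding * and ~.
function Get-EdiDelimiters {
    param([string]$EdiText)

    if ($EdiText.Length -lt 4 -or -not $EdiText.StartsWith('ISA')) {
        throw "Text does not start with an ISA segment."
    }

    # The field (element) delimiter is the character right after "ISA".
    $element = $EdiText[3]

    # Walk past the 16 field delimiters that precede ISA16.
    $pos = -1
    for ($i = 0; $i -lt 16; $i++) {
        $pos = $EdiText.IndexOf($element, $pos + 1)
        if ($pos -lt 0) { throw "ISA segment has fewer than 16 elements." }
    }
    if ($pos + 2 -ge $EdiText.Length) { throw "ISA segment is truncated." }

    [pscustomobject]@{
        Element   = $element
        Component = $EdiText[$pos + 1]   # ISA16, a single character
        Segment   = $EdiText[$pos + 2]   # the character that ends the ISA segment
    }
}

# Shortened copy of the sample interchange used in this post.
$sample = "ISA*00*          *00*          *ZZ*MYCUSTOMER*ZZ*MYCOUNTRY*170823*1610*U*00401*000000117*0*T*:~GS*PO*BTS-SENDER*RECEIVE-APP*170823*1610*117*T*00401~ST*850*0117~BEG*00*NE*391949**20170828~"

$d = Get-EdiDelimiters $sample
$e = [regex]::Escape($d.Element)   # escape it, since * is a RegEx metacharacter

# Same lazy-match idea as the BEG patterns below, built from the detected delimiter.
$OrderNum = [regex]::Match($sample, "BEG${e}.*?${e}.*?${e}(.*?)${e}").Groups[1].Value
Write-Host "Element=$($d.Element) Segment=$($d.Segment) OrderNum=$OrderNum"
</pre>

Escaping the detected character matters because a partner could pick a delimiter such as | or ^ that means something else inside a RegEx pattern.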

Requirement

I was archiving the EDI files in BizTalk with the filename set to “%datetime%_%MessageID%_EDI.txt”. I decided it would be better to name the files COMPANYNAME_DOCTYPE_ORDERNO_ORDERDATE_%datetime%_%MessageID%_EDI.txt.
NOTE: I could have done this logic in a custom C# BizTalk Pipeline, but decided to do it after the fact with a simpler PowerShell script that would be easier for administrative staff to maintain and update.

Sample 1 – Just test the parsing

With this sample, you can copy the contents of a file into the $ediText string, and test.

<pre>
cls
#Note: if the EDI text contains double quotes, substitute " with `" in the string to escape the quotes within quotes issue 
$ediText = "ISA*00*          *00*          *ZZ*MYCUSTOMER*ZZ*MYCOUNTRY*170823*1610*U*00401*000000117*0*T*:~GS*PO*BTS-SENDER*RECEIVE-APP*170823*1610*117*T*00401~ST*850*0117~BEG*00*NE*391949**20170828~N1*BY*DELIVERY-ADDRESS~N1*ST*DELIVERY-ADDRESS~N3*1420 MAINSTREET DR~N4*DALLAS*TX*12345~PO1*1*5.00*EA*4.350**IN*106889~PID*F****SAND MIX ( SSM80 )~PO1*2*1.00*etc...~"; 

$CompanyID  = [regex]::match($ediText,'.*ISA\*.*?\*.*?\*.*?\*.*?\*.*?\*(.*?)\*.*').Groups[1].Value
$OrderNum   = [regex]::match($ediText,'.*BEG\*.*?\*.*?\*(.*?)\*.*').Groups[1].Value
$OrderDate  = [regex]::match($ediText,'.*BEG\*.*?\*.*?\*.*?\*.*?\*(.*?)[~\*].*').Groups[1].Value
$EdiDocType = [regex]::match($ediText,'.~ST\*(.*?)[~\*].*').Groups[1].Value

Write-Host "CompanyID = $CompanyID"; 
Write-Host "OrderNum = $OrderNum"; 
Write-Host "OrderDate= $OrderDate"; 
Write-Host "EdiDocType= $EdiDocType"; 
</pre>

Sample 2 – Renaming Files Based on EDI Key Fields

<pre>
cls

$DirName = "d:\BizTalk\EDIHorizon\Archive\EDI850Order\"

#only rename files that start with the year, 2017, 2018, etc...  thus 20*.txt 
Get-ChildItem $Dirname -Filter 20*.txt | 
Foreach-Object {

    $fullname = $_.FullName.ToString();  
    $dirname = $_.Directory.ToString(); 
    $filename = $_.Name.ToString(); 

    Write-Host "OldName $fullname"
    #-Raw (PowerShell 3.0+) returns the whole file as one string instead of an array of lines 
    $content = Get-Content $_.FullName -Raw

    $CompanyID  = [regex]::match($content,'.*ISA\*.*?\*.*?\*.*?\*.*?\*.*?\*(.*?)\*.*').Groups[1].Value
    $OrderNum   = [regex]::match($content,'.*BEG\*.*?\*.*?\*(.*?)\*.*').Groups[1].Value
    $OrderDate  = [regex]::match($content,'.*BEG\*.*?\*.*?\*.*?\*.*?\*(.*?)[~\*].*').Groups[1].Value
    $EdiDocType = [regex]::match($content,'.~ST\*(.*?)[~\*].*').Groups[1].Value
    Write-Host "$OrderNum $OrderDate"

    Write-Host "Filename=$filename"
    $newFileName = $dirname + "\" + $CompanyID + "_" + $EdiDocType + "_" + $OrderNum + "_" + $OrderDate  + "_" + $filename
    Write-Host "NewName $newFileName`n" 
    Rename-Item $fullname $newFileName 
}
</pre>

Having a filename like this makes it faster to search the archives for certain types of orders or files from a certain partner, or to do quick counts, based on the filename alone. For example, how many files did we get from XYZ company yesterday and today? This could be done in BizTalk with BAM as well, but my current client opted out of the overhead and complexity of BAM, especially since BizTalk was (for the most part) just passing the files around, not creating them.
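As a rough sketch of that kind of quick count, the snippet below groups the renamed files by the first two parts of the filename (company and document type). The helper name Get-EdiFileKey is hypothetical, the folder is the one from Sample 2, and I am assuming that the company ID itself contains no underscore and that the file's LastWriteTime is a good enough stand-in for the received date.

<pre>
# Hypothetical helper: pull "COMPANY_DOCTYPE" out of a renamed archive filename.
# Assumes the company ID itself contains no underscore.
function Get-EdiFileKey {
    param([string]$FileName)
    $parts = $FileName -split '_'
    "{0}_{1}" -f $parts[0], $parts[1]
}

# Assumed archive folder; substitute your own path.
$archiveDir = "d:\BizTalk\EDIHorizon\Archive\EDI850Order\"
$since = (Get-Date).Date.AddDays(-1)   # midnight yesterday

if (Test-Path $archiveDir) {
    Get-ChildItem $archiveDir -Filter '*_EDI.txt' |
        Where-Object { $_.LastWriteTime -ge $since } |
        Group-Object { Get-EdiFileKey $_.Name } |
        Sort-Object Count -Descending |
        Select-Object Count, Name
}
</pre>

Each output row would show a count next to a key such as MYCUSTOMER_850, which is usually all an administrator needs for a sanity check.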

The variable $EdiDocType above holds the transaction set ID from the ST segment, something like 850, 855, 856, 810, 997, etc.

I might add one more feature. Many of the trading partners don’t use a name in the ISA segment, but rather a DUNS number, phone number, or other ID number. I might add a lookup table to translate that code to a short name that represents the trading partner.
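A lookup like that could be as simple as a hashtable. This is only a sketch: the IDs and short names below are made up, and Get-PartnerShortName is a hypothetical helper, not something the scripts above already contain.

<pre>
# Hypothetical lookup table: the IDs and short names are invented for illustration.
$PartnerNames = @{
    '123456789'  = 'ACME'
    '2145551212' = 'GLOBEX'
}

function Get-PartnerShortName {
    param([string]$CompanyID)

    # A real ISA sender ID is padded with trailing spaces, so trim before the lookup.
    $key = $CompanyID.Trim()

    # Fall back to the raw ID when the partner is not in the table.
    if ($PartnerNames.ContainsKey($key)) { $PartnerNames[$key] } else { $key }
}

Write-Host (Get-PartnerShortName '123456789      ')
Write-Host (Get-PartnerShortName 'MYCUSTOMER')
</pre>

In Sample 2, the result of Get-PartnerShortName $CompanyID would then replace $CompanyID when building $newFileName. Falling back to the raw ID means a new partner still gets a usable filename before anyone updates the table.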
