Thursday, March 16, 2017

PowerShell - Add line numbers to a file & replace a string

I had a simple requirement: prepend a line number to every line in a file and, at the same time, escape every double-quote (") by doubling it (""). However, there are not many ways to accomplish this in DOS / PowerShell, and with a LARGE file it becomes slow and cumbersome.

My first attempt took ~15 mins to process a 13 MB file. The code below took < 2 mins for the same file.

The code is a PowerShell script that takes the source and target file paths as its two input parameters.

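$i = 0                        # index of the chunk currently being processed
$outfile = $Args[1]           # second argument: target file path
# Remove any previous output; skip the delete if the file does not exist yet
if (Test-Path $outfile) { Remove-Item $outfile }
# First argument: source file path. Read it in chunks of up to 100000 lines.
Get-Content -ReadCount 100000 $Args[0] | foreach {
    # Here $_ is an array holding the lines of the current chunk
    for ($j = 0; $j -lt $_.Count; $j++) {
        # Right-align "<line number>: " in 10 characters, then double every double-quote in the line
        $_[$j] = "$(($i*100000+$j) + 1): ".PadLeft(10) + ($_[$j] -replace '"', '""')
    }
    $i++
    # Append the whole processed chunk to the target file in one write
    $_ | Out-File -Encoding ascii $outfile -Append
}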
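
To see what the per-line expression produces, here is a minimal sketch that applies the same formatting to two made-up lines, outside the pipeline. The function name Format-NumberedLine and the sample text are assumptions for illustration only; they are not part of the script.

```powershell
# Hypothetical helper: formats one line the same way the loop body does.
function Format-NumberedLine {
    param([int]$Number, [string]$Line)
    # Right-align "<number>: " in a 10-character field, then double every double-quote.
    "${Number}: ".PadLeft(10) + ($Line -replace '"', '""')
}

# Made-up sample input.
$sample = @('plain text', 'she said "hi"')
$result = for ($k = 0; $k -lt $sample.Count; $k++) {
    Format-NumberedLine -Number ($k + 1) -Line $sample[$k]
}
$result
```

The second line comes out as `2: she said ""hi""`, left-padded with spaces so that the number prefix is 10 characters wide.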



Some Notes:

1) By default, the Get-Content command sends one line at a time down the pipeline.  VERY SLOW.
2) Adding the -ReadCount #n parameter tells the Get-Content command to loop through the input file in chunks of #n lines, loading each chunk into memory.
3) To use it effectively, be aware that $_ is then an array of up to #n lines (the last chunk may be shorter).  To operate on each line, pipe the output into a foreach statement and loop over the items in the array, as demonstrated above.
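
The difference described in these notes can be observed on a small throwaway file. This is only a sketch: the temporary file, its five lines, and the chunk size of 2 are assumptions chosen to keep the output short.

```powershell
# Write a small throwaway file (5 lines) to a temporary path.
$tmp = [System.IO.Path]::GetTempFileName()
Set-Content -Path $tmp -Value @('a', 'b', 'c', 'd', 'e')

# Default behaviour: each pipeline object is a single line.
$defaultCount = 0
Get-Content $tmp | ForEach-Object { $defaultCount++ }

# With -ReadCount 2: each pipeline object is an array of up to 2 lines.
$chunkSizes = @()
Get-Content -ReadCount 2 $tmp | ForEach-Object { $chunkSizes += @($_).Count }

"Objects without -ReadCount: $defaultCount"
"Chunk sizes with -ReadCount 2: $($chunkSizes -join ', ')"

# Clean up the temporary file.
Remove-Item $tmp
```

For the five-line file, the first pipeline processes 5 objects, while the second processes 3 chunks of sizes 2, 2 and 1. Fewer pipeline objects and fewer writes to the output file are where the time savings come from.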


