I had a simple requirement: prepend a line number to every line in a file and, at the same time, escape every double quote (") by doubling it (""). However, DOS and PowerShell offer few ways to accomplish this, and with a LARGE file the obvious approaches become slow and cumbersome.
My first attempt took ~15 minutes to process a 13 MB file. The code below took under 2 minutes for the same file.
The code is a PowerShell script that takes the source and target file paths as input parameters.
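# $Args[0] = source file, $Args[1] = target file
$i = 0    # index of the chunk currently being processed
$outfile = $Args[1]
# Remove output left over from a previous run (skip if the file does not exist yet)
if (Test-Path $outfile) { del $outfile }
Get-Content -ReadCount 100000 $Args[0] | foreach {
    # $_ is an array holding up to 100000 lines
    for ($j = 0; $j -lt $_.Count; $j++) {
        # Right-aligned line number, followed by the line with every " doubled
        $_[$j] = "$(($i*100000+$j) + 1): ".PadLeft(10) + ($_[$j] -replace "`"","`"`"")
    }
    $i++
    # Append the whole processed chunk to the target file in one write
    $_ | Out-File -Encoding ascii $outfile -Append
}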
Some Notes:
1) By default, the Get-Content command sends one line at a time down the pipeline. VERY SLOW.
2) Adding the -ReadCount #n parameter tells the Get-Content command to loop through the input file in chunks of #n lines, loading each chunk into memory.
3) To use it effectively, be aware that the $_ variable is then an array of up to #n lines (the last chunk may be shorter). To operate on each line, you have to pipe the output into a foreach statement and loop over each item in the array, as demonstrated above.
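To see the difference described in these notes, here is a small sketch that counts what arrives in the pipeline with and without -ReadCount. The scratch file name (readcount-demo.txt) and the chunk size of 3 are made up for illustration and are not part of the script above.

```powershell
# Hypothetical scratch file, used only for this demonstration.
$demo = Join-Path ([System.IO.Path]::GetTempPath()) 'readcount-demo.txt'
1..7 | ForEach-Object { "row $_" } | Set-Content -Path $demo

# Default behaviour: each pipeline object is a single line (a string).
$single = @(Get-Content -Path $demo | ForEach-Object { $_.GetType().Name })

# With -ReadCount 3: each pipeline object is an array of up to 3 lines.
$chunkSizes = @(Get-Content -Path $demo -ReadCount 3 | ForEach-Object { $_.Count })

"Objects received without -ReadCount: $($single.Count)"
"Chunk sizes with -ReadCount 3: $($chunkSizes -join ', ')"

# Clean up the scratch file.
Remove-Item -Path $demo
```

Fewer, larger pipeline objects mean far fewer trips through the foreach block, which is most likely where the speed-up comes from. The trade-off is that each chunk has to fit in memory.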