"Mike" == Mike Peckar <fog@fognet.com> writes:
Mike> This seemed like it should be simple, but I'm at wits' end. I
Mike> simply want to find duplicates in the third column of a csv
Mike> file, and output the duplicate line _and_ the original line
Mike> that matched it. There are a million examples out there that
Mike> will output just the duplicate but not both.

Mike> In the data below, I'm looking for lines that match in the 3rd
Mike> column...

The sorting part is easy...

    sort -t "," -k 3,3 <file>

Now to find the duplicates... I'd probably jump to a perl script:

    perl -e 'while (<>) { @t = split(",", $_); push @{$t{$t[2]}}, $_; }
             foreach (sort keys %t) { print @{$t{$_}} if $#{$t{$_}} > 0; }' <file>

Should also do the right thing. First it splits out the third column as
a key and stuffs each line into a hash of arrays keyed on it. Then it
sorts the keys and prints the groups holding more than one line.
Admittedly done off the top of my head, without any actual testing. :-)

Mike> Normal,Server,xldspntc02,,10.33.52.185,
Mike> Normal,Server,xldspntc02,,10.33.52.186,
Mike> Normal,Server,xldspntc04,,10.33.52.187,
Mike> Normal,Server,xldspntcs01,10.33.16.198,
Mike> Normal,Server,xldspntcs01,,10.33.16.199,
Mike> Normal,Server,xldsps01,10.33.16.162,
Mike> Normal,Server,xldsps02,10.33.16.163,

Mike> My desired output would be:

Mike> Normal,Server,xldspntc02,,10.33.52.185,
Mike> Normal,Server,xldspntc02,,10.33.52.186,
Mike> Normal,Server,xldspntcs01,10.33.16.198,
Mike> Normal,Server,xldspntcs01,,10.33.16.199,

Mike> $ awk -F, 'dup[$3]++' file.csv

Mike> I played around with the prev variable, but could not plumb it
Mike> out fully, e.g. { print prev }

Mike> Mike

Mike> _______________________________________________
Mike> Wlug mailing list
Mike> Wlug@mail.wlug.org
Mike> http://mail.wlug.org/mailman/listinfo/wlug
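Since Mike was already close with `awk -F, 'dup[$3]++'` (which prints only the second and later occurrences), a two-pass awk variant can emit the originals as well. This is an untested sketch along the same lines, not Mike's or John's solution; `file.csv` is a placeholder name, and it assumes a plain CSV with no quoted commas:

```shell
# Pass 1 (NR==FNR): count how often each column-3 value appears.
# Pass 2: print every line whose column-3 value occurred more than
# once -- the duplicates *and* the original lines they matched.
awk -F, 'NR==FNR { count[$3]++; next } count[$3] > 1' file.csv file.csv
```

The same file is named twice on purpose: `NR==FNR` is only true while awk is still reading the first copy, so the counting pass finishes before any printing starts, and the output keeps the input's original line order.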
participants (1)
- John Stoffel