[Date Prev][Date Next][Thread Prev][Thread Next][Date Index][Thread Index]
Re: [Sheflug] Weird algorithm question
On Fri, Jun 25, 2010 at 04:57:39PM +0100, Alex Hudson wrote:
> On 25/06/10 16:37, Adam Funk wrote:
>> I also posted this to the corpora mailing list (related to natural
>> language processing) and got some interesting suggestions about graph
>> search algorithms.
>>
>
> Yeah; thinking about it, it's not really travelling salesman because
> you're looking for the longest route, not the longest route visiting all
> nodes (as I understand the problem, anyway).
>
> Thinking about this, there ought to be some MapReduce magic you could
> work here. If there are "loops" where you have start and end nodes with
> multiple "routes" between them but only two vertices per node, e.g.:
>
> --A--C--E--T--B--
> \----F--G----/
>
> .. then you can automatically throw away the A->F->G->B route because
> it's never going to be part of any longest route.
>
> I don't know how much of that kind of pruning you can actually do, but I
> would have thought that could help cut down the search space pretty
> usefully.
>
> It's still going to be a pretty crazy complex problem to solve tho ;)
On the other hand, the OP's dataset sounds like more like a simple tree,
in which case a simple O(N) algorithm should work:
#!/usr/bin/perl -w
my %parent;
my %nonterminal;
while (<DATA>) {
chomp;
my ($child,$parent) = split /,/, $_;
$parent{$child} = $parent;
$nonterminal{$parent} = 1;
}
for my $base (keys %parent) {
next if $nonterminal{$base};
my $n = $base;
while (defined $n) {
print "$n,";
$n = $parent{$n};
}
print "\n";
}
__END__
100001930,100001740
100002137,100001740
100002452,100001930
100002684,100001930
100003553,100002684
100003993,100003553
100004258,100003553
100004475,100004258
100005787,100004475
100005930,100004475
outputs
100002452,100001930,100001740,
100005930,100004475,100004258,100003553,100002684,100001930,100001740,
100005787,100004475,100004258,100003553,100002684,100001930,100001740,
100002137,100001740,
100003993,100003553,100002684,100001930,100001740,
--
Nothing ventured, nothing lost.
_______________________________________________
Sheffield Linux User's Group
http://sheflug.org.uk/mailman/listinfo/sheflug_sheflug.org.uk
FAQ at: http://www.sheflug.org.uk/mailfaq.html
GNU - The Choice of a Complete Generation