StackStalk
  • Home
  • Java
    • Java Collection
    • Spring Boot Collection
  • Python
    • Python Collection
  • C++
    • C++ Collection
    • Progamming Problems
    • Algorithms
    • Data Structures
    • Design Patterns
  • General
    • Tips and Tricks

Wednesday, February 27, 2013

Longest compound word (LCW) from a list of words

 February 27, 2013     Programming problems in CPP     No comments   

Write a program to find the longest compound word (LCW) from a list of words

For example, given string list, "hello" and "ball", "world", "helloworld", "morning", "helloball", the LCW is helloworld. In the below solution we use trie, though there are other ways to solve. Follow this C++ Tries to understand more about implementing Tries.

Approach to find the LCW

  1. Traverse the Trie to find initial set of words which have valid substrings (from the word beginning) and create pair (main word:foundpart) and append the pair to a queue. For example in the above list, hello does not make it to the list but helloball makes it to the list with the pair "helloball":"hello"

  2. Now process one by one pairs. See if the remaining part of the main word (that is by removing the found part) has any valid words in the list. If found add the new pair to the queue.

  3. Whenever the found part is same as the main word, we found a compound word. Keep track of the best found compound word from their length point of view

Above we have assumed that the full list of words was not available and we have to extract them from the given trie. If the list was available then we could have avoided the first step (below function fillInitialCWs) and instead we could have added all the list words to the queue with a pair as word:"". That is empty string as foundpart.

Also note that in fillInitialCWs, instead of preparing the list of words, we have advanced by one step to prepare the potential compound words directly!

C++ program to find the LCW

#include <iostream>
#include <vector>
#include <queue>
#include <list>

using namespace std;

struct stringpair {
string full;
string foundpart;
stringpair(string afull,string afoundpart):full(afull),foundpart(afoundpart) {}
};


class Node {
public:
Node() { mContent = ' '; mMarker = false; }
~Node() {}
char content() { return mContent; }
void setContent(char c) { mContent = c; }
bool wordMarker() { return mMarker; }
void setWordMarker() { mMarker = true; }
Node* findChild(char c);
void appendChild(Node* child) { mChildren.push_back(child); }
vector<Node*> children() { return mChildren; }

private:
char mContent;
bool mMarker;
vector<Node*> mChildren;
};

class Trie {
public:
Trie();
~Trie();
void addWord(string s);
void deleteWord(string s);
void fillInitialCWs(Node *node, string currentWord);
string findLCW();
void checkRemainingPart(string full, string foundPart);

private:
Node* root;
list<string> mLCWStack;
queue<stringpair*> mProcessingPairs;
string mLCW;
};

Node* Node::findChild(char c)
{
for ( int i = 0; i < mChildren.size(); i++ )
{
Node* tmp = mChildren.at(i);
if ( tmp->content() == c )
{
return tmp;
}
}

return NULL;
}

Trie::Trie()
{
root = new Node();
}

Trie::~Trie()
{
// Free memory
}

void Trie::addWord(string s)
{
Node* current = root;

if ( s.length() == 0 )
{
current->setWordMarker(); // an empty word
return;
}

for ( int i = 0; i < s.length(); i++ )
{
Node* child = current->findChild(s[i]);
if ( child != NULL )
{
current = child;
}
else
{
Node* tmp = new Node();
tmp->setContent(s[i]);
current->appendChild(tmp);
current = tmp;
}
if ( i == s.length() - 1 )
current->setWordMarker();
}
}


void Trie::fillInitialCWs(Node *node, string currentWord)
{
bool validWord = false;
vector<Node*> children = node->children();
char content = node->content();
string newWord = currentWord.append(1,content);
if ( node->wordMarker() ) {
validWord = true;
if (mLCWStack.size() > 0) {
for (std::list<string>::iterator it = mLCWStack.begin(); it != mLCWStack.end(); it++) {
mProcessingPairs.push(new stringpair(newWord,*it));
}
}

mLCWStack.push_front(newWord);
}
for ( int i = 0; i < children.size(); i++ )
{
Node* tmp = children.at(i);
fillInitialCWs(tmp,newWord);
}

if (validWord) {
mLCWStack.pop_front();
}

}

void Trie::checkRemainingPart(string full, string foundPart)
{
string remPart = full.substr(foundPart.length(),full.length() - foundPart.length());
Node* current = root;
string currentWord = foundPart;

for ( int i = 0; i < remPart.length(); i++ )
{
Node* tmp = current->findChild(remPart[i]);
if ( tmp == NULL )
return;
currentWord.append(1,remPart[i]);
if ( tmp->wordMarker() )
mProcessingPairs.push(new stringpair(full,currentWord));
current = tmp;
}

}

string Trie::findLCW()
{
// traverse the trie and fill the first set of compound words.
fillInitialCWs(root,"");
mLCW = "";
int currentLCWLength = 0;
while (!mProcessingPairs.empty()) {
stringpair* it = mProcessingPairs.front();
mProcessingPairs.pop();
if (it->foundpart == it->full ) {
if ( (it->full).length() > currentLCWLength ) {
currentLCWLength = (it->full).length();
mLCW = it->full;
}
} else {
checkRemainingPart(it->full, it->foundpart);
}
}
return mLCW;
}

// Test program
int main()
{
Trie* trie = new Trie();
trie->addWord("hello");
trie->addWord("balloon");
trie->addWord("ball");
trie->addWord("world");
trie->addWord("helloworld");
trie->addWord("helloxworld");
trie->addWord("helloworldyball");
trie->addWord("helloworldball");
trie->addWord("helloworldballoon");
trie->addWord("helloworldballoonxyz");
trie->addWord("morninghelloworldball");


string LCW = trie->findLCW();
cout << "Longest Compound World: " << LCW << endl;

delete trie;
}

Output:-
Longest Compound World:  helloworldballoon
Email ThisBlogThis!Share to XShare to Facebook

Related Posts:

  • Longest common substring problemWrite a program to find the longest common sub-string of two given strings The approach:- Key to the LCS problem is to generate a LCS matrix based … Read More
  • Reverse words of a string using stackWrite a program to reverse words of a string using a stack The simplest approach to reverse words of a string is using a stack. To understand more ab… Read More
  • Compress a stringWrite a program to compress a string. If input is "aabbb" expected output is "a2b3" The approach:- Have 2 pointers read and write to the input st… Read More
  • Replace spaces in a string with "%20".Write a program to replace spaces in a string with %20 The approach: Typically in this problem we are expected to use optimal memory. This requires … Read More
  • Find if two strings are anagramsWrite a program to find if two strings are anagrams Anagram - Wikipedia An anagram is a type of word play, the result of rearranging the letters of a… Read More
Newer Post Older Post Home

0 comments:

Post a Comment

Follow @StackStalk
Get new posts by email:
Powered by follow.it

Popular Posts

  • Python FastAPI file upload and download
    In this article, we will look at an example of how to implement a file upload and download API in a Python FastAPI microservice. Example bel...
  • Avro Producer and Consumer with Python using Confluent Kafka
    In this article, we will understand Avro a popular data serialization format in streaming data applications and develop a simple Avro Produc...
  • Monitor Spring Boot App with Micrometer and Prometheus
    Modern distributed applications typically have multiple microservices working together. Ability to monitor and manage aspects like health, m...
  • Server-Sent Events with Spring WebFlux
    In this article we will review the concepts of server-sent events and work on an example using WebFlux. Before getting into this article it ...
  • Accessing the Kubernetes API
    In this article, we will explore the steps required to access the Kubernetes API and overcome common challenges. All operations and communic...
  • Python FastAPI microservice with Okta and OPA
    Authentication (AuthN) and Authorization (AuthZ) is a common challenge when developing microservices. In this article, we will explore how t...
  • Scheduling jobs in Python
    When developing applications and microservices we run into scenarios where there is a need to run scheduled tasks. Examples include performi...
  • Using Tekton to deploy KNative services
    Tekton is a popular open-source framework for building continuous delivery pipelines. Tekton provides a declarative way to define pipelines ...

Copyright © 2025 StackStalk