The HackerEarth Find the Number problem is a classic searching task that tests whether a program can identify the presence of given values efficiently. Although the statement is usually simple, the challenge is to choose the right data structure so that multiple lookups do not become unnecessarily slow. A clean solution focuses on preprocessing the input numbers and answering each query in the fastest practical way.
TLDR: The problem asks the program to determine whether certain queried numbers exist in a given list. The most efficient common approach is to store all numbers in a hash set and check each query in expected constant time. This reduces the overall complexity to O(N + Q), where N is the number of elements and Q is the number of queries. A sorting and binary search method also works, but it is usually slightly slower for this use case.
Understanding the Problem
In the typical version of HackerEarth Find the Number, the program receives an array or list of integers. After that, it receives one or more query values. For each query, the program must print whether the queried number is present in the original list.
The output is usually something like YES if the number exists and NO if it does not. The exact capitalization may depend on the platform statement, so the submitted code should match the required output format exactly.
For example, suppose the input is:
5
1 4 7 9 12
3
4
8
12
Here the list holds five numbers (1, 4, 7, 9, 12), and the three queries are 4, 8, and 12. Since 4 and 12 exist in the list, their answers are YES. Since 8 is missing, its answer is NO. The expected output is:
YES
NO
YES
Why a Simple Linear Search Is Not Enough
The most direct idea is to scan the whole array for every query. If the queried number is found, the program prints YES; otherwise, after checking all elements, it prints NO.
This works correctly, but it can be inefficient. If there are N numbers and Q queries, a linear search approach may require O(N × Q) operations. For small input sizes, that may pass. However, competitive programming problems often contain large test cases, and this method can lead to a time limit exceeded error.
For instance, if N is 100,000 and Q is also 100,000, the worst-case number of comparisons can reach 10,000,000,000. That is far too high for most online judge limits.
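To make the cost concrete, here is a minimal sketch of the linear search baseline described above. The function name answerByLinearSearch and its vector-based signature are assumptions made for illustration; the original problem reads from standard input instead.

```cpp
#include <string>
#include <vector>

// Baseline for comparison only: scans the whole list for every query,
// so the worst case costs about N * Q comparisons.
// The name and signature are illustrative, not part of the problem statement.
std::vector<std::string> answerByLinearSearch(const std::vector<int>& numbers,
                                              const std::vector<int>& queries) {
    std::vector<std::string> answers;
    answers.reserve(queries.size());
    for (int x : queries) {
        bool found = false;
        for (int value : numbers) {  // up to N comparisons per query
            if (value == x) {
                found = true;
                break;
            }
        }
        answers.push_back(found ? "YES" : "NO");
    }
    return answers;
}
```

Called with the sample list {1, 4, 7, 9, 12} and the queries {4, 8, 12}, this returns YES, NO, YES. The nested loop is what makes it too slow for large N and Q.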
Best Approach: Use a Hash Set
The best common solution is to insert every number from the original list into a hash set. A hash set stores unique values and allows fast membership checking. In languages such as C++, this is usually done with unordered_set. In Java, it can be done with HashSet. In Python, the built-in set is used.
The logic is simple:
- Read the number of elements, N.
- Read all N integers.
- Insert each integer into a hash set.
- Read the number of queries, Q.
- For every query value, check whether it exists in the hash set.
- Print YES if it exists; otherwise, print NO.
This approach is efficient because insertion into a hash set takes expected O(1) time, and lookup also takes expected O(1) time. Therefore, the total expected time complexity becomes O(N + Q).
Algorithm Explanation
The algorithm can be described in a few clear steps. First, the program creates an empty hash set. Then it reads each element from the input array and stores it in that set. Duplicate values do not matter, because the question only asks whether a number is present, not how many times it appears.
After preprocessing, the program handles each query independently. For a query number x, it checks whether x is contained in the set. If the lookup succeeds, the number is present in the original list. If the lookup fails, the number was never inserted, so it is absent.
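The point about duplicates can be illustrated with a small sketch. The helper names buildLookup and isPresent are hypothetical and exist only for this example; they wrap the standard std::unordered_set operations.

```cpp
#include <unordered_set>
#include <vector>

// Builds the lookup set from the input values.
// Repeated values collapse into a single stored entry.
std::unordered_set<int> buildLookup(const std::vector<int>& values) {
    return std::unordered_set<int>(values.begin(), values.end());
}

// Returns true if x was present in the original list.
bool isPresent(const std::unordered_set<int>& lookup, int x) {
    return lookup.count(x) > 0;
}
```

For the input {4, 4, 7, 4}, buildLookup stores only two distinct values, yet isPresent still reports that 4 is present and that 8 is absent. That is all the problem requires.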
C++ Solution
#include <bits/stdc++.h>
using namespace std;

int main() {
    // Speed up cin/cout for large inputs.
    ios::sync_with_stdio(false);
    cin.tie(nullptr);

    int n;
    cin >> n;

    // Preprocessing: store every input value in a hash set.
    unordered_set<int> numbers;
    for (int i = 0; i < n; i++) {
        int value;
        cin >> value;
        numbers.insert(value);
    }

    int q;
    cin >> q;

    // Answer each query with one expected O(1) lookup.
    while (q--) {
        int x;
        cin >> x;
        if (numbers.find(x) != numbers.end()) {
            cout << "YES\n";
        } else {
            cout << "NO\n";
        }
    }

    return 0;
}
This code uses unordered_set<int> for fast lookup. The function find() returns an iterator to the element if it exists. If the element is not found, it returns numbers.end(). That condition is enough to decide the answer.
Complexity Analysis
The preprocessing step inserts N elements into the hash set. Since each insertion is expected constant time, this step takes O(N) time on average. Each of the Q queries is answered in expected O(1) time, so all queries together take O(Q).
Therefore, the expected total time complexity is:
O(N + Q)
The space complexity is O(N), because the hash set stores the input values. If duplicates exist, the actual number of stored elements may be smaller than N, but the upper bound remains O(N).
Alternative Approach: Sorting and Binary Search
Another valid method is to sort the array first and then use binary search for each query. Sorting takes O(N log N) time, and each query takes O(log N) time. The total complexity becomes O(N log N + Q log N).
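A minimal sketch of this alternative follows, using std::sort and std::binary_search from the standard library. The function name answerByBinarySearch and its signature are assumptions made for illustration, not part of the problem statement.

```cpp
#include <algorithm>
#include <string>
#include <vector>

// Sorts a copy of the input once, then answers each query by binary search.
// The name and signature are illustrative only.
std::vector<std::string> answerByBinarySearch(std::vector<int> numbers,
                                              const std::vector<int>& queries) {
    std::sort(numbers.begin(), numbers.end());  // O(N log N), done once

    std::vector<std::string> answers;
    answers.reserve(queries.size());
    for (int x : queries) {
        // O(log N) per query on the sorted data
        bool found = std::binary_search(numbers.begin(), numbers.end(), x);
        answers.push_back(found ? "YES" : "NO");
    }
    return answers;
}
```

The list is taken by value so that sorting does not modify the caller's data. With the unsorted input {12, 1, 9, 4, 7} and the queries {4, 8, 12}, the result is YES, NO, YES.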
This method is also reliable and may be preferred when worst-case behavior matters: a hash table can degrade to O(N) per operation when many keys collide, whereas binary search always takes O(log N) per query. However, for typical HackerEarth constraints, the hash set solution is usually simpler and faster.
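As a rough comparison, the sketch below turns the three complexity formulas into step counts. These are order-of-magnitude estimates that ignore constant factors, not measured running times, and all function names are hypothetical.

```cpp
#include <cstdint>

// Smallest k such that 2^k >= n; stands in for log N in the estimates.
int ceilLog2(std::uint64_t n) {
    int k = 0;
    std::uint64_t power = 1;
    while (power < n) {
        power *= 2;
        ++k;
    }
    return k;
}

// Linear search: every query may scan all N elements.
std::uint64_t linearSearchSteps(std::uint64_t n, std::uint64_t q) {
    return n * q;
}

// Hash set: one expected O(1) step per insertion and per query.
std::uint64_t hashSetSteps(std::uint64_t n, std::uint64_t q) {
    return n + q;
}

// Sorting plus binary search: about N log N to sort, Q log N to query.
std::uint64_t sortAndBinarySearchSteps(std::uint64_t n, std::uint64_t q) {
    return (n + q) * static_cast<std::uint64_t>(ceilLog2(n));
}
```

For N = Q = 100,000, these estimates give 10,000,000,000 steps for linear search, 200,000 for the hash set, and 3,400,000 for sorting with binary search. Both preprocessing approaches are far below the linear search figure.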
Common Mistakes
- Using linear search for every query: This can cause time limit exceeded errors for large inputs.
- Printing the wrong output format: The judge may require YES/NO, not Yes/No.
- Ignoring fast input: In C++, using ios::sync_with_stdio(false) and cin.tie(nullptr) helps with large data.
- Counting duplicates unnecessarily: The problem usually asks only for existence, so a set is enough.
FAQ
What is the main idea behind HackerEarth Find the Number?
The main idea is to check whether each queried number exists in a previously given list of numbers.
Which data structure is best for this problem?
A hash set is usually the best choice because it supports expected O(1) insertion and lookup.
Can binary search be used instead?
Yes. The array can be sorted, and each query can be answered using binary search. This gives O(N log N + Q log N) time complexity.
Does the solution need to handle duplicate numbers?
Duplicates do not affect the answer if the task only asks whether a number is present. A set automatically stores each distinct value once.
Why might a correct solution still fail?
A correct solution may fail if it uses a slow approach, prints the wrong format, or does not handle large input efficiently.