Problem 105: Special subset sums: testing

(see projecteuler.net/problem=105)

Let S(A) represent the sum of elements in set A of size n. We shall call it a special sum set if for any two non-empty disjoint subsets,
B and C, the following properties are true:

i. S(B) != S(C); that is, sums of subsets cannot be equal.
ii. If B contains more elements than C then S(B) > S(C).

For example, { 81, 88, 75, 42, 87, 84, 86, 65 } is not a special sum set because 65 + 87 + 88 = 75 + 81 + 84,
whereas { 157, 150, 164, 119, 79, 159, 161, 139, 158 } satisfies both rules for all possible subset pair combinations and S(A) = 1286.

Using sets.txt, a 4K text file with one-hundred sets containing seven to twelve elements (the two examples given above are the first two sets in the file), identify all the special sum sets, A1, A2, ..., Ak, and find the value of S(A1) + S(A2) + ... + S(Ak).

NOTE: This problem is related to Problem 103 and Problem 106.

Algorithm

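The check function is pretty much the same code as in problem 103; see there for a full explanation. In short, it enumerates all 2^n - 1 non-empty subsets with a bitmask, rejects the set as soon as two subsets share a sum, and finally verifies that no subset has a higher sum than any subset with more elements.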
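Rule ii alone can also be tested without enumerating subsets. The sketch below is an alternative technique, not what the code further down does: after sorting, it compares the sum of the k+1 smallest elements against the sum of the k largest elements for every k with 2k+1 <= n. The function name satisfiesRuleTwo is hypothetical, and the sketch assumes strictly positive elements.

```cpp
#include <algorithm>
#include <vector>

// Hypothetical helper: true if every subset with more elements has a
// larger sum than any disjoint subset with fewer elements (rule ii).
// Takes the values by copy because they are sorted in place.
bool satisfiesRuleTwo(std::vector<unsigned int> values)
{
  std::sort(values.begin(), values.end());

  // running sum of the k+1 smallest and the k largest elements
  unsigned long long smallest = values.empty() ? 0 : values.front();
  unsigned long long largest  = 0;

  // both groups must be disjoint, hence 2k+1 <= n
  for (size_t k = 1; 2 * k + 1 <= values.size(); k++)
  {
    smallest += values[k];
    largest  += values[values.size() - k];

    // the k+1 smallest elements must beat the k largest ones
    if (smallest <= largest)
      return false;
  }
  return true;
}
```

This works because the k+1 smallest elements are the weakest possible larger subset and the k largest elements are the strongest possible smaller subset; if even that pair satisfies rule ii, every other pair does. Rule i still needs the subset-sum scan, so this can only serve as a cheap early rejection.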

I spent most of the time writing code to parse the input (and did it twice because Project Euler uses a CSV format while Hackerrank uses a simpler space-separated format).
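For comparison, a comma-separated line can also be split with the standard library instead of reading byte by byte. This is a minimal sketch under the assumption that a whole line is already available as a string; parseCsvLine is a hypothetical name and does not appear in the solution below.

```cpp
#include <sstream>
#include <string>
#include <vector>

// Hypothetical helper: convert one line such as "81,88,75" into numbers.
std::vector<unsigned int> parseCsvLine(const std::string& line)
{
  std::vector<unsigned int> result;

  std::istringstream stream(line);
  std::string token;
  // std::getline with a custom delimiter yields one token per comma
  while (std::getline(stream, token, ','))
  {
    // skip empty tokens, e.g. caused by a trailing comma
    if (token.empty())
      continue;
    // std::stoul throws if the token does not start with a number
    result.push_back(static_cast<unsigned int>(std::stoul(token)));
  }
  return result;
}
```

A caller would read each line with std::getline(std::cin, line) and pass it on. Note that std::stoul reports errors via exceptions, so this variant does not fit a build that disables them.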

Modifications by HackerRank

The set can have up to 100 elements. That's 2^100 subsets ... my code can't handle that.
I always return "NO" whenever a set has 30 or more elements to avoid timeouts. Fun fact: that's the correct result for all tests.
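One plausible reason why the guess keeps working is a pigeonhole argument: a set of n strictly positive elements has 2^n - 1 non-empty subsets, but their sums can only take values from 1 to S(A). If there are more subsets than possible sums, two subsets must collide and rule i fails. The sketch below turns this into a pre-check; the name ruledOutByPigeonhole is hypothetical, and whether Hackerrank's element values are small enough for it to apply to every large test is an assumption, since the limits are not stated here.

```cpp
#include <vector>

// Hypothetical pre-check (not part of the original solution):
// returns true if the set certainly is NOT special because it has more
// non-empty subsets than distinct sums available.
// Assumes all elements are strictly positive.
bool ruledOutByPigeonhole(const std::vector<unsigned int>& set)
{
  // widen to 64 bits so that large sets don't overflow
  unsigned long long fullSum = 0;
  for (auto x : set)
    fullSum += x;

  // with 64 or more elements, 2^n - 1 exceeds any sum of n 32-bit values
  if (set.size() >= 64)
    return true;

  // number of non-empty subsets
  unsigned long long subsets = (1ULL << set.size()) - 1;

  // more subsets than possible sums => at least two sums are equal
  return subsets > fullSum;
}
```

A result of false means nothing: the set may still violate either rule and has to go through the full check. A set of 30 elements can only pass this pre-check if its elements add up to at least 2^30 - 1.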

My code

… was written in C++11 and can be compiled with G++, Clang++ and Visual C++.

The code contains #ifdefs to switch between the original problem and the Hackerrank version.
Define ORIGINAL (uncomment the #define below or compile with -DORIGINAL) to produce the result for the original problem (default setting for most problems).

//#define ORIGINAL
 
#include <iostream>
#include <vector>
 
typedef std::vector<unsigned int> Sequence;
 
// return true if sequence is special
bool check(const Sequence& sequence)
{
  // sum of all elements
  unsigned int fullSum = 0;
  for (auto x : sequence)
    fullSum += x;

  // mark each generated sum as true, no collisions allowed
  std::vector<bool> sums(fullSum + 1, false);

  // track the lowest and highest sum for each subset size
  std::vector<unsigned int> maxSum(sequence.size() + 1, 0);
  std::vector<unsigned int> minSum(sequence.size() + 1, fullSum + 1);
  minSum[0] = maxSum[0] = 0; // empty set

  unsigned int fullMask = (1u << sequence.size()) - 1;

  // 2^elements iterations (actually, I ignore the empty set)
  for (unsigned int mask = 1; mask <= fullMask; mask++)
  {
    unsigned int sum  = 0;
    unsigned int size = 0;
    for (unsigned int element = 0; element < sequence.size(); element++)
    {
      // use that element ?
      unsigned int bit = 1u << element;
      if ((mask & bit) == 0)
        continue;

      sum += sequence[element];
      // count subset size
      size++;
    }

    // two subsets share the same sum ?
    if (sums[sum])
      return false;
    sums[sum] = true;

    // adjust lowest and highest sum of current subset size
    if (minSum[size] > sum)
      minSum[size] = sum;
    if (maxSum[size] < sum)
      maxSum[size] = sum;
  }

  // make sure that no set with fewer elements has a higher sum
  for (size_t i = 1; i < sequence.size(); i++)
    if (maxSum[i] > minSum[i + 1])
      return false;

  // yes, the sequence is special
  return true;
}
 
#ifdef ORIGINAL
 
// convert a line of Project Euler's format into a sequence
Sequence readLine()
{
  Sequence result;
  while (true)
  {
    result.push_back(0);

    char oneByte = 0;
    while (true)
    {
      oneByte = std::cin.get();

      // end of file ?
      if (!std::cin)
        return result;

      // not a digit ?
      if (oneByte < '0' || oneByte > '9')
        break;

      // append digit
      result.back() *= 10;
      result.back() += oneByte - '0';
    }

    // end of line
    if (oneByte != ',')
      break;
  }
  return result;
}
 
#else
 
// convert a line of Hackerrank's format into a sequence
Sequence readLine()
{
  // read number of elements
  unsigned int size;
  std::cin >> size;

  // read elements
  Sequence result(size);
  for (auto& x : result)
    std::cin >> x;

  return result;
}
 
#endif
 
int main()
{
  unsigned int tests = 100;
#ifdef ORIGINAL
  unsigned int sum = 0;
#else
  std::cin >> tests;
#endif

  while (tests--)
  {
    auto sequence = readLine();

#ifdef ORIGINAL
    // special ?
    if (check(sequence))
      // yes !
      for (auto x : sequence)
        sum += x;
#else
    // special ?
    if (sequence.size() < 30) // 2^30 is already one billion subsets ...
      std::cout << (check(sequence) ? "YES" : "NO") << std::endl;
    else
      std::cout << "NO" << std::endl; // just make a guess
#endif
  }

#ifdef ORIGINAL
  std::cout << sum << std::endl;
#endif

  return 0;
}

This solution contains 28 empty lines, 22 comments and 13 preprocessor commands.

Example input

The Hackerrank version expects the number of test cases, followed by the size of each set and then its members (separated by spaces or newlines). For the two example sets from the problem statement:
echo "2 8 81 88 75 42 87 84 86 65 9 157 150 164 119 79 159 161 139 158" | ./105

This prints NO for the first set and YES for the second one.

Benchmark

The correct solution to the original Project Euler problem was found in less than 0.01 seconds on an Intel® Core™ i7-2600K CPU @ 3.40GHz.
(compiled for x86_64 / Linux, GCC flags: -O3 -march=native -fno-exceptions -fno-rtti -std=c++11 -DORIGINAL)

Changelog

May 16, 2017 submitted solution
May 16, 2017 added comments

Hackerrank

see https://www.hackerrank.com/contests/projecteuler/challenges/euler105

My code solved 17 out of 17 test cases (score: 100%)

Difficulty

Project Euler ranks this problem at 45% (out of 100%).

Hackerrank describes this problem as easy.

Note:
Hackerrank has strict execution time limits (typically 2 seconds for C++ code) and often a much wider input range than the original problem.
In my opinion, Hackerrank's modified problems are usually a lot harder to solve. As a rule of thumb: brute-force is never an option.

Links

projecteuler.net/thread=105 - the best forum on the subject (note: you have to submit the correct solution first)

Code in various languages:

Python: www.mathblog.dk/project-euler-105-sum-special-sum-sets-file/ (written by Kristian Edlund)
Scala: github.com/samskivert/euler-scala/blob/master/Euler105.scala (written by Michael Bayne)
