If you approach from the mathematics side of things, building on log₁₀ is the completely obvious approach to use. If it seems unintuitive, that’s just because you don’t understand logarithms.
> the autograding test cases did not include a test using a power of 10.
That’s a pretty glaring oversight. Boundary cases are probably the most important things to cover in such tests. I’d be wanting to test things like 0, 1, 9, 10, 11, 99, 100, 101, and so forth. (Also negative numbers, unless the problem explicitly constrained it to {0, 1, 2, …}.)
We often talk about edge cases in testing, but this reveals that the edges can be not just at the extremes, but also at intermediate value changes. Put another way (still fuzzy!), these are the interesting values. You test interesting values.
This would also be a good place to use property testing; instead of hard-coding a bunch of tests (or perhaps as well as), compare against a large number of generated values, comparing with a known-good implementation, probably just len(str(number)).
On another note, you have a very interesting website, I will be in touch at some point when I decide to tackle Rust
In a lot of cases, the most useful property is "for all inputs, the result is the right answer", but we don't know the right answer until we've written the algorithm, and there's not usually much point writing two algorithms unless, like here, one of those algorithms is trivial correct but inappropriate for some reason.
More generally, I feel like the more trivial a property is to check, the simpler the code to implement it is, and the less useful the test is in general. In the extreme case, a getter/setter pair is very easy to test with PBT, but you rarely need to be testing your getters and setters.
The goal is to learn, and the point of the exercises is to teach a specific concept. If a student finds a different way around the problem, that may show that they're already proficient in other skills, but they haven't necessarily learned the concept being taught in this class yet. A good instructor would probably acknowledge the solution, but add extra boundaries to the task to get the student to explore the problem in a way that lets them encounter the testing difficulties discussed here.
It's like smuggling a calculator into a class about mental maths strategies: you'll probably do very well in the final test, but you won't have learned anything!
int numDigits(int num){
if (num < 0){
return numDigits(-1 * num);
}
if (num < 10){
return 1;
}
if (num < 100){
return 2;
}
if (num < 1000){
return 3;
}
if (num < 10000){
return 4;
}
if (num < 100000){
return 5;
}
if (num < 1000000){
return 6;
}
if (num < 10000000){
return 7;
}
if (num < 100000000){
return 8;
}
if (num < 1000000000){
return 9;
}
return 10; // cant be more than 10 since sizeof(int) == 4, otherwise extend to 19
}If the number is less than 10, return one, otherwise is it less than 100, ...
You should prefer the simplest tool that can do the job. Multiplication is simpler than division. You could digress about floating point errors (Python has infinite-precision integers, but not infinite- precision floats) or performance (division may be in some contexts meaningfully slower) but those are just digressions, the important thing is that it's more complicated.
The edge cases also seem completely clear to me in this approach, you won't accidentally write <= 100, and 0 is handled automatically.
Of course, the division and log and string methods are all fine in practice, but I'm surprised not to see this alternative mentioned.
My Favorite Programming Problem to Teach: Digit Length - >>21500434 - Nov 2019 (106 comments)
I hope the author warned the students about floating point imprecision :P
Though powers of 10 is a boundary condition so would be one of the first things I test against
return 1 + (n >= 10) + (n >= 100) + (n >= 1e3) + (n >= 1e4) + (n >= 1e5) + (n >= 1e6) + (n >= 1e7) + (n >= 1e8) + (n >= 1e9) ... perl -e 'print length 987654'
6
Similarly easy in Lisp with either of write-to-string, prin1-to-string, and princ-to-string. I expect python to have some such built-in function too.I did this which will work with any length of number[1], and appears to work for all edge cases, including numbers that start with zero and doesn't use loops:
(defun num-digits (n)
(if (< n 10)
1
(+ 1 (num-digits (floor n 10)))))
[1] Millions of digits, if your computer has the RAM for it.I consider that to be wrong - leading zeros are not part of the number and should be ignored. Using `len(str(x))` results in `2` for the input `"01"`.
i.e. I know this super complex answer in which is it happens to be the only purely accurate thing. Can you guess it? vs what I would like to see which is: here are the foundational concepts, bring them all together and arrive at the naturally correct conclusion demonstrating your knowledge and understanding
math.ceil(math.log10(x+1)) def numDigits(num):
return len(str(num))
print (numDigits(1)) # Returns the correct answer
print (numDigits("1")) # Returns the correct answer
print (numDigits("01")) # Returns the incorrect answer
The "returns the wrong answer" is the problem. If the input is the wrong type, I expect there to be an error raised, not silently give me wrong answers.The other solutions fail differently - they generates an error and stop processing.
The `len(str())` failure doesn't generate an error, and doesn't stop processing, it simply returns the wrong answer.
So, the `len` approach is wrong, the other approaches are right.
This post was written by an undergraduate student.
A professor would use that as a teaching opportunity to discuss the folly of relying on floating point numbers.
That's because the teacher here is an undergraduate student who has yet to learn from painful experience :)
sum(n>=10**i for i in range(100))
for (for example) inputs up to a googol. Or with your fix for the base case, 1 + sum(n>=10**i for i in range(1, 100))
Another cute way that hides the loop from itertools import count, takewhile
1 + len(list(takewhile(lambda x: 10**x <= n, count(1))))Yes, we obviously normally use a single digit to write zero, but we could logically write it as the empty string "" if we had a way to indicate that a specific number was meant. (Relative to that, writing 0 is just as unnecessary as writing 00 or 000, because we can have a rule that all leading zeroes need not be written.)
As another example of this intuition, suppose that we wrote all integers with a trailing decimal point. In that case the numbers from 1 to 10 would be "1.", "2.", "3.", "4.", "5.", "6.", "7.", "8.", "9.", and "10.", while zero could quite logically just be "." with no digits (as it has only leading zeroes, or we could say "nothing in the ones' place").
Quite a lot of arithmetic algorithms would probably work just fine with these conventions! In fact, they might have fewer exceptions or special cases than they do when they expect an explicit 0.
For instance, using the "zero is written as empty string" convention, Python int() (here simplified to assume input strings of zero or more decimal digits) and str() could look like
def int(s):
n = 0
for c in s:
n *= 10
n += "0123456789".find(c)
return n
def str(n):
s = ""
while n:
n, d = divmod(n, 10)
s = "0123456789"[d] + s
return s
Seems pretty general! Yes, it's annoying or confusing if the output is going to be viewed by a human user in a larger string context without delimiters, as we probably don't want to see things like "I ate tomatoes" instead of "I ate 0 tomatoes".Certainly we can use the empty string to denote the number zero, but it doesn’t mean that we have a straight forward convention for how we denote any number which have some null value in some power lower than the most significant one.
Not that it’s impossible to come with some convention that ditches 0 as intermediary digit. For example 302009==3e5+2e3+9 will evaluate to true in many languages out there. In Ruby we even have `302009 === 3e5+2e3+9+''.to_i` that is evaluated to true.
But this notation loses the conveniences afforded by a positional fixed base numeral system provides.
1+"1"
[]+{}
({}).foo
[]/2
Of course there's the type checkers for python like mypy and pyright but in my experience the type systems these provide are wildly underpowered for statically typing even slightly more complex idiomatic python.Re the num digits example: One way to go about this is to add asserts for checking the input types, though generally in python it's considered more idiomatic to check how an object behaves than what type it is.
In any case in practice I find that by far most type systems, bolted on or built in or otherwise, are more of a hassle than they are worth. One of the few exceptions I've found so far is D. When you rewrite the num digits function in D such that it accepts any type like the python example does, it should be of no surprise that it has the exact same bug as the python version:
int numDigits(T)(T num){
return num.to!string.length
}
I'm not entirely sure what my point is other than to point out that typing in python isn't an afterthought. it just uses a different type system to what you're used to which allows for bugs you're not used to. hlen:
// int hlen(uint64_t x)
// takes an integer, returns length of hex string
mov x9, #16 // max width
clz x0, x0 // count binary leading zeros
cmp x0, #64
lsr x0, x0, #2 // 4 bits per hex digit
sub x0, x9, x0
cinc x0, x0, eq // if num = 0, add 1
ret
Decimal requires looping (well, only ~20 comparisons needed for 64-bit so maybe that could be unrolled but same thing either way), so it's simpler to just use high level.The 'clever' solution fails miserably on value zero and needs hard-coding to handle it. That looks like using the wrong tool.
As for the handling of zero, well, that’s not about mathematics or about how a computer stores or represents data—that’s about how humans represent data. We choose to special-case zero, allowing it to have a leading zero which we never allow for anything else. All solutions in software will need to special-case zero.
import itertools
def digitLength(n):
if n == 0:
return 1
*_, last = itertools.takewhile(lambda acc: acc[0] != 0, itertools.accumulate(itertools.repeat(None), lambda acc, x: (acc[0]//10, acc[1] + 1), initial=(n, 0)))
return last[1]
the problem is interesting in python because i think the looping solution is not optimal because it performs N divisions for an arbitrarily large integer whereas I think there should be a solution that performs O(log(N)) multiplications for an arbitrarily large integer. in other languages with fixed integers its not really an issue how many operations you do since its effectively constant.We don't want generated comments on HN, or even human-generated summaries. They are too generic to produce interesting conversation, besides which we want people to actually look at, and perhaps even read, articles.
Fortunately your earlier comments look fine so this should be easy to fix!
"", "1", "2", "3", "4", ..., "9", "10", "11", "12", ..., "99", "100", "101", ...
The only change is the empty string being a valid name for a number (the number zero).
I don't mean to suggest getting rid of place value or the digit 0! This is more like making the place value system more consistent with respect to a corner case.
And the practical reason that we can't make this change is that we don't have a way to distinguish in writing between the absence of any string and the presence of the empty string.
def noloops(n):
x = lambda n, digits: (n / 1e9, digits + (n >= 10) + (n >= 100) + (n >= 1e3) + (n >= 1e4) + (n >= 1e5) + (n >= 1e6) + (n >= 1e7) + (n >= 1e8) + (n >= 1e9))
y = lambda n, digits: x(*x(*x(*x(*x(*x(n, digits))))))
return y(*y(*y(*y(*y(*y(n, 1))))))[1]
Now we have it up to 324 digits without any loops.