Optimize astStringVerbose() for large arrays (#1815)

Change the astStringVerbose() recursion to extend a string instead of
returning one. This has the benefit that for tokens where the recursion
runs deep (typically large arrays), the time savings can be substantial
(see comments on benchmarks further down).

The reason is that previously, for each token, the astString of its
operands was constructed, and then appended to this tokens astString.
This led to a lot of unnecessary string copying (and with that
allocations). Instead, by passing the string by reference, the number
of temporary strings is greatly reduced.

Another way of seeing it is that previously, the string was constructed
from end to beginning, but now it is constructed from the beginning to
end. There was no notable speedup by preallocating the entire string
using string::reserve() (at least not on Linux).

To benchmark, the changes and master were tested on Linux using the
commands:

	make
	time cppcheck --debug --verbose $file >/dev/null

i.e., the cppcheck binary was compiled with the settings in the
Makefile. Printing the output to screen or file will of course take
longer time.

In Trac ticket #8355 which triggered this change, an example file from the
Wine repository was attached. Running the above cppcheck on master took
24 minutes and with the changes in this commmit, took 22 seconds.

Another test made was on lib/tokenlist.cpp in the cppcheck repo, which is
more "normal" file. On that file there was no measurable time difference.

A synthetic benchmark was generated to illustrate the effects on dumping
the ast for arrays of different sizes. The generate code looked as
follows:

	const int array[] = {...};

with different number of elements. The results are as follows (times are
in seconds):

	N	master optimized
	10	0.1    0.1
	100	0.1    0.1
	1000	2.8    0.7
	2000	19     1.8
	3000	53     3.8
	5000	350    10
	10000	3215   38

As we can see, for small arrays, there is no time difference, but for
large arrays the time savings are substantial.
This commit is contained in:
Rikard Falkeborn 2019-04-30 13:35:48 +02:00 committed by Daniel Marjamäki
parent b0baf4f65b
commit c7d7f8738c
3 changed files with 25 additions and 14 deletions

View File

@ -1365,7 +1365,7 @@ void Token::printAst(bool verbose, bool xml, std::ostream &out) const
astStringXml(tok, 2U, out); astStringXml(tok, 2U, out);
out << "</ast>" << std::endl; out << "</ast>" << std::endl;
} else if (verbose) } else if (verbose)
out << tok->astStringVerbose(0,0) << std::endl; out << tok->astStringVerbose() << std::endl;
else else
out << tok->astString(" ") << std::endl; out << tok->astString(" ") << std::endl;
if (tok->str() == "(") if (tok->str() == "(")
@ -1374,18 +1374,16 @@ void Token::printAst(bool verbose, bool xml, std::ostream &out) const
} }
} }
static std::string indent(const unsigned int indent1, const unsigned int indent2) static void indent(std::string &str, const unsigned int indent1, const unsigned int indent2)
{ {
std::string ret(indent1,' '); for (unsigned int i = 0; i < indent1; ++i)
str += ' ';
for (unsigned int i = indent1; i < indent2; i += 2) for (unsigned int i = indent1; i < indent2; i += 2)
ret += "| "; str += "| ";
return ret;
} }
std::string Token::astStringVerbose(const unsigned int indent1, const unsigned int indent2) const void Token::astStringVerboseRecursive(std::string& ret, const unsigned int indent1, const unsigned int indent2) const
{ {
std::string ret;
if (isExpandedMacro()) if (isExpandedMacro())
ret += '$'; ret += '$';
ret += mStr; ret += mStr;
@ -1395,16 +1393,26 @@ std::string Token::astStringVerbose(const unsigned int indent1, const unsigned i
if (mImpl->mAstOperand1) { if (mImpl->mAstOperand1) {
unsigned int i1 = indent1, i2 = indent2 + 2; unsigned int i1 = indent1, i2 = indent2 + 2;
if (indent1==indent2 && !mImpl->mAstOperand2) if (indent1 == indent2 && !mImpl->mAstOperand2)
i1 += 2; i1 += 2;
ret += indent(indent1,indent2) + (mImpl->mAstOperand2 ? "|-" : "`-") + mImpl->mAstOperand1->astStringVerbose(i1,i2); indent(ret, indent1, indent2);
ret += mImpl->mAstOperand2 ? "|-" : "`-";
mImpl->mAstOperand1->astStringVerboseRecursive(ret, i1, i2);
} }
if (mImpl->mAstOperand2) { if (mImpl->mAstOperand2) {
unsigned int i1 = indent1, i2 = indent2 + 2; unsigned int i1 = indent1, i2 = indent2 + 2;
if (indent1==indent2) if (indent1 == indent2)
i1 += 2; i1 += 2;
ret += indent(indent1,indent2) + "`-" + mImpl->mAstOperand2->astStringVerbose(i1,i2); indent(ret, indent1, indent2);
ret += "`-";
mImpl->mAstOperand2->astStringVerboseRecursive(ret, i1, i2);
} }
}
std::string Token::astStringVerbose() const
{
std::string ret;
astStringVerboseRecursive(ret);
return ret; return ret;
} }

View File

@ -1069,6 +1069,9 @@ private:
/** Update internal property cache about string and char literals */ /** Update internal property cache about string and char literals */
void update_property_char_string_literal(); void update_property_char_string_literal();
/** Internal helper function to avoid excessive string allocations */
void astStringVerboseRecursive(std::string& ret, const unsigned int indent1 = 0U, const unsigned int indent2 = 0U) const;
public: public:
void astOperand1(Token *tok); void astOperand1(Token *tok);
void astOperand2(Token *tok); void astOperand2(Token *tok);
@ -1118,7 +1121,7 @@ public:
return ret + sep + mStr; return ret + sep + mStr;
} }
std::string astStringVerbose(const unsigned int indent1, const unsigned int indent2) const; std::string astStringVerbose() const;
std::string expressionString() const; std::string expressionString() const;

View File

@ -6983,7 +6983,7 @@ private:
// Return stringified AST // Return stringified AST
if (verbose) if (verbose)
return tokenList.list.front()->astTop()->astStringVerbose(0, 0); return tokenList.list.front()->astTop()->astStringVerbose();
std::string ret; std::string ret;
std::set<const Token *> astTop; std::set<const Token *> astTop;