Skip to content

Updated ruleBlockIfNoExit to better select branch if both are noexit. #11

New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Open
wants to merge 1 commit into
base: master
Choose a base branch
from
Open
Show file tree
Hide file tree
Changes from all commits
Commits
File filter

Filter by extension

Filter by extension

Conversations
Failed to load comments.
Loading
Jump to
Jump to file
Failed to load files.
Loading
Diff view
Diff view
46 changes: 46 additions & 0 deletions Ghidra/Features/Decompiler/src/decompile/cpp/block.cc
Original file line number Diff line number Diff line change
Expand Up @@ -1323,6 +1323,20 @@ void BlockGraph::restoreXmlBody(List::const_iterator &iter,List::const_iterator
}
}

int4 BlockGraph::getInnerBlockDepth(void)

{
int4 depth;
int4 maxDepth = 0;
for(int4 i=0;i<list.size();++i){
depth = list[i]->getBlockDepth();
if(depth>maxDepth){
maxDepth=depth;
}
}
return maxDepth;
}

/// This is currently just a wrapper around the FlowBlock::restoreXml()
/// that sets of the BlockMap resolver
/// \param el is the root \<block> tag
Expand Down Expand Up @@ -2505,6 +2519,12 @@ bool BlockBasic::isDoNothing(void) const
return hasOnlyMarkers();
}

int4 BlockBasic::getOpSize(void)

{
return op.size();
}
Comment on lines +2522 to +2526

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

critical

There is no check to ensure that op is not empty before calling op.size(). This could lead to a crash if the block has no operations. While this might be an uncommon scenario, it's good to add a check for robustness.

Suggested change
int4 BlockBasic::getOpSize(void)
{
return op.size();
}
{
if (op.empty()) {
return 0; // Or some other appropriate value
}
return op.size();
}


/// In terms of machine instructions, a basic block always covers a range of addresses,
/// from its first instruction to its last. This method establishes that range.
/// \param beg is the address of the first instruction in the block
Expand Down Expand Up @@ -2747,6 +2767,13 @@ FlowBlock *BlockList::getSplitPoint(void)
return getBlock(getSize()-1)->getSplitPoint();
}

int4 BlockList::getBlockDepth(void)

{
// list join block together but don't increase block depth
return getInnerBlockDepth();
}

void BlockList::printHeader(ostream &s) const

{
Expand Down Expand Up @@ -2831,6 +2858,13 @@ void BlockCondition::saveXmlHeader(ostream &s) const
a_v(s,"opcode",nm);
}

int4 BlockCondition::getBlockDepth(void)

{
// conditions join block together but don't increase block depth
return getInnerBlockDepth();
}

void BlockIf::markUnstructured(void)

{
Expand Down Expand Up @@ -3185,6 +3219,18 @@ FlowBlock *BlockSwitch::nextFlowAfter(const FlowBlock *bl) const
return getParent()->nextFlowAfter(this);
}

int4 BlockSwitch::getBlockDepth(void){
int4 i;
int4 maxDepth=0;
for(i=0;i<caseblocks.size();++i){
int4 depth=caseblocks[i].block->getBlockDepth();
if(depth>maxDepth){
maxDepth=depth;
}
}
return maxDepth+2; // +1 for switch block and +1 for case/default block

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

high

The calculation maxDepth + 2 could potentially lead to an integer overflow if maxDepth is sufficiently large. While the likelihood of this occurring in practice might be low, it's worth considering adding a check to prevent this.

Suggested change
return maxDepth+2; // +1 for switch block and +1 for case/default block
{
if (maxDepth > INT_MAX - 2) { // INT_MAX is defined in <limits.h>
return INT_MAX; // Or handle the overflow in some other way
}
return maxDepth + 2; // +1 for switch block and +1 for case/default block
}

}

BlockMap::BlockMap(const BlockMap &op2)

{
Expand Down
7 changes: 7 additions & 0 deletions Ghidra/Features/Decompiler/src/decompile/cpp/block.hh
Original file line number Diff line number Diff line change
Expand Up @@ -181,6 +181,7 @@ public:
/// \param enditer marks the end of the XML tags
/// \param resolver is used to recover FlowBlock objects based on XML references
virtual void restoreXmlBody(List::const_iterator &iter,List::const_iterator enditer,BlockMap &resolver) {}
virtual int4 getBlockDepth(void) {return 0;} ///< Return the depth in code block of \b this
void saveXmlEdges(ostream &s) const; ///< Save edge information to an XML stream
Comment on lines +184 to 185
Copy link

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

🛠️ Refactor suggestion

Declare getBlockDepth() as const and mark overrides explicitly

getBlockDepth() is a query that does not mutate state, yet it’s non-const in the base class and all overrides.
Keeping it non-const prevents calling it on const FlowBlock objects, forces const_casts elsewhere, and diverges from the surrounding accessors (e.g. getType()).

-  virtual int4 getBlockDepth(void) {return 0;}        ///< Return the depth in code block of \b this
+  virtual int4 getBlockDepth(void) const { return 0; } ///< Return the depth in code block of \b this

Then, in every derived class add the const qualifier and the override specifier, e.g.

-  virtual int4 getBlockDepth() { return getInnerBlockDepth() + 1; }
+  int4 getBlockDepth() const override { return getInnerBlockDepth() + 1; }

Besides catching accidental signature mismatches at compile-time, this makes the intent explicit and enables further optimisation by the compiler.

📝 Committable suggestion

‼️ IMPORTANT
Carefully review the code before committing. Ensure that it accurately replaces the highlighted code, contains no missing lines, and has no issues with indentation. Thoroughly test & benchmark the code to ensure it meets the requirements.

Suggested change
virtual int4 getBlockDepth(void) {return 0;} ///< Return the depth in code block of \b this
void saveXmlEdges(ostream &s) const; ///< Save edge information to an XML stream
virtual int4 getBlockDepth(void) const { return 0; } ///< Return the depth in code block of \b this
void saveXmlEdges(ostream &s) const; ///< Save edge information to an XML stream

void restoreXmlEdges(List::const_iterator &iter,List::const_iterator enditer,BlockMap &resolver);
void saveXml(ostream &s) const; ///< Write out \b this to an XML stream
Expand Down Expand Up @@ -299,6 +300,8 @@ public:
virtual void finalizePrinting(const Funcdata &data) const;
virtual void saveXmlBody(ostream &s) const;
virtual void restoreXmlBody(List::const_iterator &iter,List::const_iterator enditer,BlockMap &resolver);
virtual int4 getInnerBlockDepth(); ///< Return max depth of child blocks
virtual int4 getBlockDepth() {return getInnerBlockDepth()+1;}
Comment on lines +303 to +304
Copy link

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

🛠️ Refactor suggestion

Propagate const/override to all implementations

The newly added overrides in BlockGraph, BlockList, BlockCondition, and BlockSwitch follow the non-const prototype.
After adjusting the base declaration, remember to update these:

-virtual int4 getInnerBlockDepth();
-int4 getBlockDepth() { return getInnerBlockDepth() + 1; }
+int4 getInnerBlockDepth() const;
+int4 getBlockDepth() const override { return getInnerBlockDepth() + 1; }

and similarly for the other three classes.

Failing to add override can silently create a new unrelated virtual instead of overriding, should the signature drift in the future.

Also applies to: 508-509, 538-539, 684-685

void restoreXml(const Element *el,const AddrSpaceManager *m); ///< Restore \b this BlockGraph from an XML stream
void addEdge(FlowBlock *begin,FlowBlock *end); ///< Add a directed edge between component FlowBlocks
void addLoopEdge(FlowBlock *begin,int4 outindex); ///< Mark a given edge as a \e loop edge
Expand Down Expand Up @@ -401,6 +404,7 @@ public:
list<PcodeOp *>::const_iterator beginOp(void) const { return op.begin(); } ///< Return an iterator to the beginning of the PcodeOps
list<PcodeOp *>::const_iterator endOp(void) const { return op.end(); } ///< Return an iterator to the end of the PcodeOps
bool emptyOp(void) const { return op.empty(); } ///< Return \b true if \b block contains no operations
int4 getOpSize(void); ///< Number of PcodeOps contained in \b this block
static bool noInterveningStatement(PcodeOp *first,int4 path,PcodeOp *last);
Comment on lines +407 to 408
Copy link

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

🛠️ Refactor suggestion

BlockBasic::getOpSize() should be const

The helper only inspects the op list; making it const allows calling it on const-qualified FlowBlock pointers (as done in selectBestNoExit) without a const_cast.

-int4 getOpSize(void);
+int4 getOpSize(void) const;

Remember to update the definition in block.cc.

📝 Committable suggestion

‼️ IMPORTANT
Carefully review the code before committing. Ensure that it accurately replaces the highlighted code, contains no missing lines, and has no issues with indentation. Thoroughly test & benchmark the code to ensure it meets the requirements.

Suggested change
int4 getOpSize(void); ///< Number of PcodeOps contained in \b this block
static bool noInterveningStatement(PcodeOp *first,int4 path,PcodeOp *last);
int4 getOpSize(void) const; ///< Number of PcodeOps contained in \b this block
static bool noInterveningStatement(PcodeOp *first,int4 path,PcodeOp *last);

};

Expand Down Expand Up @@ -501,6 +505,7 @@ public:
virtual PcodeOp *lastOp(void) const;
virtual bool negateCondition(bool toporbottom);
virtual FlowBlock *getSplitPoint(void);
virtual int4 getBlockDepth(void);
};

/// \brief Two conditional blocks combined into one conditional using BOOL_AND or BOOL_OR
Expand Down Expand Up @@ -530,6 +535,7 @@ public:
virtual bool isComplex(void) const { return getBlock(0)->isComplex(); }
virtual FlowBlock *nextFlowAfter(const FlowBlock *bl) const;
virtual void saveXmlHeader(ostream &s) const;
virtual int4 getBlockDepth(void);
};

/// \brief A basic "if" block
Expand Down Expand Up @@ -675,6 +681,7 @@ public:
virtual void emit(PrintLanguage *lng) const { lng->emitBlockSwitch(this); }
virtual FlowBlock *nextFlowAfter(const FlowBlock *bl) const;
virtual void finalizePrinting(const Funcdata &data) const;
virtual int4 getBlockDepth(void);
};

/// \brief Helper class for resolving cross-references while deserializing BlockGraph objects
Expand Down
54 changes: 48 additions & 6 deletions Ghidra/Features/Decompiler/src/decompile/cpp/blockaction.cc
Original file line number Diff line number Diff line change
Expand Up @@ -1463,6 +1463,7 @@ bool CollapseStructure::ruleBlockIfNoExit(FlowBlock *bl)
{
FlowBlock *clauseblock;
int4 i;
int4 bestIndex=-1;

if (bl->sizeOut() != 2) return false; // Must be binary condition
if (bl->isSwitchOut()) return false;
Expand All @@ -1480,15 +1481,56 @@ bool CollapseStructure::ruleBlockIfNoExit(FlowBlock *bl)
// bl->setGotoBranch(i);
// return true;
// }
if (bestIndex==-1){
bestIndex=i;
}else{ // both match
bestIndex = selectBestNoExit(bl->getOut(0),bl->getOut(1));
}
}
if(bestIndex==-1) return false; // no match
clauseblock = bl->getOut(bestIndex);
if (bestIndex==0) { // clause must be true out of bl
if (bl->negateCondition(true))
dataflow_changecount += 1;
}
graph.newBlockIf(bl,clauseblock);
return true;
}

if (i==0) { // clause must be true out of bl
if (bl->negateCondition(true))
dataflow_changecount += 1;
/// Select the best of two NoExit branch to be collapsed by ruleBlockIfNoExit.
/// \param clause0 is the first NoExit branch
/// \param clause1 is the second NoExit branch
/// \return the index of the selected branch (0 or 1)
int4 CollapseStructure::selectBestNoExit(FlowBlock *clause0,FlowBlock *clause1)

{
// select lowest block depth
int4 depth0 = clause0->getBlockDepth();
int4 depth1 = clause1->getBlockDepth();
if (depth0<depth1)return 0;
if (depth1<depth0)return 1;

// same depth, prefer non return
bool isRet0 = clause0->lastOp()!=(PcodeOp *)0 && clause0->lastOp()->isStandardReturn();
bool isRet1 = clause1->lastOp()!=(PcodeOp *)0 && clause1->lastOp()->isStandardReturn();
if(isRet0 && !isRet1) return 1;
if(isRet1 && !isRet0) return 0;

// prefer block containing only return op
if(isRet0){ // both are return
FlowBlock* fb;
if(clause0->getType()==FlowBlock::t_copy){
fb = ((BlockCopy*)clause0)->subBlock(0);
if(fb->getType()==FlowBlock::t_basic && ((BlockBasic*)fb)->getOpSize()==1) return 0;
}
if(clause1->getType()==FlowBlock::t_copy){
fb = ((BlockCopy*)clause1)->subBlock(0);
if(fb->getType()==FlowBlock::t_basic && ((BlockBasic*)fb)->getOpSize()==1) return 1;
}
graph.newBlockIf(bl,clauseblock);
return true;
}
return false;

// fall back to previous behavior
return 0;
}
Comment on lines +1504 to 1534
Copy link

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

🛠️ Refactor suggestion

selectBestNoExit – unchecked casts and missing headers

  1. static_cast<BlockCopy*>(clauseX) is used without verifying that clauseX really is a t_copy. If an unforeseen FlowBlock::block_type slips through, the cast is UB. Add a guard:
if (clause0->getType() == FlowBlock::t_copy) {
  auto *cpy = static_cast<BlockCopy*>(clause0);
  ...
}
  1. BlockCopy / BlockBasic are referenced but block.hh is not included in blockaction.cc. Today this works only because a chain of includes drags the header in indirectly; add an explicit include to avoid accidental breakage:
#include "block.hh"
  1. Consider early-returning when the depths differ to avoid computing return properties needlessly.

  2. A nit: int4 depth0 = ... can under-flow if getBlockDepth() ever returns -1 to signal “unknown”. Guard with assert(depth >= 0) or treat -1 as “worst” to prevent negative comparison surprises.


/// Try to find a while/do structure, starting with a given FlowBlock.
Expand Down
Original file line number Diff line number Diff line change
Expand Up @@ -217,6 +217,7 @@ class CollapseStructure {
bool ruleCaseFallthru(FlowBlock *bl); ///< Attempt to one switch case falling through to another
int4 collapseInternal(FlowBlock *targetbl); ///< The main collapsing loop
void collapseConditions(void); ///< Simplify conditionals
int4 selectBestNoExit(FlowBlock *clause0,FlowBlock *clause1);
public:
CollapseStructure(BlockGraph &g); ///< Construct given a control-flow graph
int4 getChangeCount(void) const { return dataflow_changecount; } ///< Get number of data-flow changes
Expand Down
2 changes: 2 additions & 0 deletions Ghidra/Features/Decompiler/src/decompile/cpp/op.hh
Original file line number Diff line number Diff line change
Expand Up @@ -176,6 +176,8 @@ public:
bool isCallOrBranch(void) const { return ((flags&(PcodeOp::branch|PcodeOp::call))!=0); }
/// \brief Return \b true if this op breaks fall-thru flow
bool isFlowBreak(void) const { return ((flags&(PcodeOp::branch|PcodeOp::returns))!=0); }
/// \brief Return \b true if this op will be decompiled as "return"
bool isStandardReturn(void) const { return ((flags&PcodeOp::returns)!=0 && getHaltType()==0); }
/// \brief Return \b true if this op flips the true/false meaning of its control-flow branching
bool isBooleanFlip(void) const { return ((flags&PcodeOp::boolean_flip)!=0); }
/// \brief Return \b true if the fall-thru branch is taken when the boolean input is true
Expand Down