Conversation

@rockleona

I have started researching computer vision. As a first step, this PR introduces Ultralytics YOLO as the object detection tool and imports the YOLO11 model as the base model.

Usage is simple: load an image and enable detection, and the result appears in the Console Widget. See the screenshot below:
Screenshot 2025-10-29 23 16 42

@@ -0,0 +1,162 @@
/*
* Copyright (c) 2022, Yung-Yu Chen <[email protected]>
Collaborator

Change to your name if you are the first author.

QPushButton *toggleBtn = nullptr;
bool enableDetection = false;
};
std::unique_ptr<Impl> m_impl;
Collaborator

I don't think we need to use the pimpl idiom.


friend root_base_type;

WrapRVisionDockWidget(pybind11::module & mod, char const * pyname, char const * pydoc)
Collaborator

If there is nothing, maybe we don't need this?

logger.setLevel(logging.DEBUG)

if 'model' not in globals():
    model = YOLO('./modmesh/pilot/yolo11n.pt')  # Please check the model path
Collaborator

How do you get the model? Maybe add runtime download logic?

Author

@rockleona Oct 30, 2025

Yes. At runtime it checks whether the model exists at the path; if it does not, it downloads the model directly to the specified path.

Author

I just found that there is a directory called thirdparty. Maybe I should point the path there instead?

Collaborator

thirdparty is for third-party libraries. In this case, I think you can put the model file in the same directory as the pilot runtime. Btw, it seems that the download logic is not implemented yet, right?

Author

The ultralytics library already implements the download logic, so there is no need to do it again. I assume it checks whether the model name is available on their file server and downloads it when the YOLO class is initialized.
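
The loading behavior discussed in this thread can be sketched roughly as follows. This is a minimal sketch, not the PR's code: `default_model_path` and `load_model` are hypothetical helper names, and placing the weights next to the module follows the reviewer's suggestion above. It assumes, as the author describes, that ultralytics fetches a known weight name such as yolo11n.pt when the file is missing; that is worth verifying against the installed version.

```python
from pathlib import Path

MODEL_NAME = "yolo11n.pt"  # weight file name used in the PR


def default_model_path(base_dir=None):
    """Return the weights path inside base_dir.

    When base_dir is None, fall back to the directory that holds this
    file (an assumption about where the pilot runtime lives).
    """
    if base_dir is None:
        base_dir = Path(__file__).resolve().parent
    return Path(base_dir) / MODEL_NAME


def load_model(path=None):
    """Create the YOLO model; ultralytics is imported lazily.

    The lazy import keeps this module importable on machines where
    ultralytics is not installed.
    """
    from ultralytics import YOLO

    if path is None:
        path = default_model_path()
    return YOLO(str(path))
```

Keeping the path logic separate from the model construction means the path can be checked in a unit test without triggering a download.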

RVisionImage::RVisionImage(QWidget *parent)
: QWidget(parent)
{
m_image.load(":/default.jpg");
Collaborator

You may want to put the image as an asset file.

Member

If you add an image in the repository, make sure it is small and uses a compatible license.

Alternatively, load an image from an online source.

@@ -0,0 +1,99 @@
"""
Collaborator

Is it possible to write tests to validate the implementation?

Author

Where would you recommend placing the tests? Perhaps in tests/test_pilot.py?

Collaborator

I think you can put them in tests/test_vision.py?
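
One possible shape for such a test file, so that it runs without downloading weights, is to inject the detector. This is only a sketch under stated assumptions: `detect_boxes`, its `min_conf` parameter, and the `(x1, y1, x2, y2, confidence)` tuple format are hypothetical and do not come from the PR.

```python
import unittest


def detect_boxes(image, detector, min_conf=0.5):
    """Hypothetical wrapper: run the detector and keep confident boxes.

    Each box is assumed to be (x1, y1, x2, y2, confidence).
    """
    return [box for box in detector(image) if box[4] >= min_conf]


class DetectBoxesTC(unittest.TestCase):

    def test_filters_low_confidence(self):
        # A stub stands in for YOLO so that no model file is needed.
        def fake_detector(image):
            return [(0, 0, 10, 10, 0.9), (5, 5, 8, 8, 0.2)]

        boxes = detect_boxes(object(), fake_detector)
        self.assertEqual(boxes, [(0, 0, 10, 10, 0.9)])

    def test_no_detection(self):
        boxes = detect_boxes(object(), lambda image: [])
        self.assertEqual(boxes, [])
```

A separate, optional test could exercise the real model and be skipped when ultralytics or the weights are unavailable.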

@yungyuc added the pilot GUI and visualization label Oct 30, 2025
@yungyuc marked this pull request as draft October 30, 2025 14:42
Member

@yungyuc left a comment

Please make sure CI passes before requesting review.

  • Correct copyright headers.
  • Remove unnecessary code like WrapRVisionDockWidget, which looks like a placeholder.
  • The pimpl class RVisionDockWidget::Impl is not necessary. Do not use pimpl.
  • Do not create a symbol named modmesh::BoundingBox.
  • Always add an end marker to classes and namespaces.

I see you are using pybind11 to call back into Python to use YOLO. Why don't you just write PySide6 to do it?

};

class RVisionDockWidget;
class Impl;
Member

The forward declaration of Impl is a no-op.

namespace modmesh
{

struct BoundingBox {
Member

Do not name it modmesh::BoundingBox; that name will be used for other purposes.

Do not create an additional namespace under modmesh. That would be over-engineering.

You may make it a nested class in RVisionDockWidget. A shorter name would make the code easier.

private:
QImage m_image;
std::vector<BoundingBox> m_boxes;
};
Member

Always keep an end marker:

...
}; /* end class RVisionImage */

const uchar *data = rgbImg.bits();
py::array_t<uint8_t> np_img({height, width, channels}, data);
py::object vision_mod = py::module_::import("modmesh.pilot._vision");
py::object yolo_func = vision_mod.attr("yolo_detect");
Member

If you call YOLO from Python, why not use PySide?

@rockleona
Author

> Please make sure CI passes before requesting for review.
>
>   • Correct copyright headers.
>   • Remove unnecessary code like WrapRVisionDockWidget, which looks like a placeholder.
>   • The pimpl class RVisionDockWidget::Impl is not necessary. Do not use pimpl.
>   • Do not create a symbol named modmesh::BoundingBox.
>   • Always add an end marker to classes and namespaces.
>
> I see you are using pybind11 to call back into Python to use YOLO. Why don't you just write PySide6 to do it?

I thought all the GUI components had to be written with Qt in C++. I will change it to PySide6, since these functions are only executed from Python.
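
For reference, a PySide6 version of the toggle could look roughly like this. It is a minimal sketch, not the planned implementation: `DetectionState` and `build_dock` are hypothetical names, the button labels are made up, and the detection call itself is left out. The Qt classes used (`QDockWidget`, `QPushButton`) are standard PySide6 widgets.

```python
class DetectionState:
    """Plain-Python toggle state; replaces the C++ Impl members."""

    def __init__(self):
        self.enabled = False

    def toggle(self):
        # Flip the flag and return the new value.
        self.enabled = not self.enabled
        return self.enabled

    def label(self):
        # Text shown on the toggle button for the current state.
        return "Disable detection" if self.enabled else "Enable detection"


def build_dock(parent=None):
    """Create a dock widget holding the toggle button.

    Requires PySide6 and an existing QApplication instance.
    """
    from PySide6.QtWidgets import QDockWidget, QPushButton

    state = DetectionState()
    dock = QDockWidget("Vision", parent)
    button = QPushButton(state.label())

    def on_clicked():
        state.toggle()
        button.setText(state.label())

    button.clicked.connect(on_clicked)
    dock.setWidget(button)
    return dock, state
```

Keeping the state in a plain class means the toggle logic can be tested without a running Qt application.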

@yungyuc
Member

yungyuc commented Nov 17, 2025

@rockleona The code base has changed a lot. Please rebase to refresh the CI status.

Labels

pilot GUI and visualization

3 participants