Skip to content

PM_DEVICE_POWER_DOMAIN refactor #89372

New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Open
wants to merge 17 commits into
base: main
Choose a base branch
from

Conversation

bjarki-andreasen
Copy link
Collaborator

@bjarki-andreasen bjarki-andreasen commented May 1, 2025

This PR refactors PM_DEVICE_POWER_DOMAIN to be simpler and more scalable, moving power domain and device constraint logic from device drivers to PM_DEVICE_RUNTIME.

Depends on and cherry picks the following:

decsny
decsny previously requested changes May 2, 2025
Copy link
Member

@decsny decsny left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The PR looks really good to me. The change request is specifically over some DT related nits. And I have one question about one of the other power domain drivers which is not being touched on this PR at the moment (power_domain_soc_state_change). Because that one in particular seems like the biggest refactor target of any of the existing in tree ones, given the goal of this PR is to remove code about calling turn on/off actions on domain children. That's basically all that driver's code seems to be for from what I can tell.

Comment on lines 12 to 19
/* Disable unused nodes on power_mode3_domain */
&dma0 {
status = "disabled";
};

&flexcomm1 {
status = "disabled";
};
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

this change is fine, and it's maybe a bigger ask from this PR but since this RW is one of the few in tree users of power domain, would it make sense on this PR to fix that soc since the power domain is described incorrectly right now?

If not I guess nxp maintainers will fix it later.. but to be clear what I'm seeing is maybe an issue on that platform which I was not aware of before because I was not involved here, is that PM3 on that chip is not a domain but a power state, and should not be described as a domain? I'm not sure about this though because I am actually basically just now getting called in to get a handle on the power architecture of Zephyr due to problems we are having with this specific chip in Zephyr when users enable CONFIG_PM.... apparently the current state of things is not working out

It seems like instead there should be a domain representing the digital region of the chip that I think should use the new default driver you are adding in this PR? And that domain should mark the zephyr suspend (called pm2 on chip) and standby (called pm3 on the chip) states as the states that disable the power domain in DT? And all of these peripherals in this overlay being disabled here are part of that domain (except the analog dac and adc, but for now we can pretend they are at minimum because the digital domain is the most affected by power state transitions on this chip)

I actually think this current driver we have written here for our current conception of the "power domain" is maybe completely backwards and maybe not even valid due to it is representing a specific mode transition and not a physical (or conceptual) domain/power region? https://github.com/zephyrproject-rtos/zephyr/blob/main/drivers/power_domain/power_domain_soc_state_change.c . I noticed also that you refactored some of the other power domain drivers regarding this children handling on/off due to the redudancy/reimplementing. I think maybe this particular driver I linked doesn't even need to exist now.

I understand if you don't want to touch that power domain driver I linked on this PR, since I think it's kind of a vendor specific mess, even though it looks it was given a generic zephyr compatible, but at least if you have any guidance or clarity about this at least I will greatly appreciate it.

Comment on lines 762 to 779
out_dt_define(f"{node.z_path_id}_POWER_DOMAIN_CHILDREN_LEN_OKAY", len(pd_child_z_path_ids))
out_dt_define(f"{node.z_path_id}_POWER_DOMAIN_CHILDREN_FOREACH_OKAY(fn)",
" ".join(
f"fn(DT_{z_path_id})"
for z_path_id in pd_child_z_path_ids))
out_dt_define(f"{node.z_path_id}_POWER_DOMAIN_CHILDREN_FOREACH_OKAY_SEP(fn, sep)",
" DT_DEBRACKET_INTERNAL sep ".join(
f"fn(DT_{z_path_id})"
for z_path_id in pd_child_z_path_ids))
out_dt_define(f"{node.z_path_id}_POWER_DOMAIN_CHILDREN_FOREACH_OKAY_VARGS(fn, ...)",
" ".join(
f"fn(DT_{z_path_id}, __VA_ARGS__)"
for z_path_id in pd_child_z_path_ids))
out_dt_define(f"{node.z_path_id}_POWER_DOMAIN_CHILDREN_FOREACH_OKAY_SEP_VARGS(fn, sep, ...)",
" DT_DEBRACKET_INTERNAL sep ".join(
f"fn(DT_{z_path_id}, __VA_ARGS__)"
for z_path_id in pd_child_z_path_ids))

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think this addition should warrant update the macros.bnf description for this change similar to like pinctrl has done for example

Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

added entry to macros.bnf

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think this is what vnd,reserved-compat is specifically for already, so this new one shouldn't be needed

Copy link
Collaborator Author

@bjarki-andreasen bjarki-andreasen May 5, 2025

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

nice, changed to use existing binding

Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

vnd,reserved-node node is specifically used to test devices with status reserved :) Added back the device-base one

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

where did you see that?

Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

zassert_false(DT_HAS_COMPAT_STATUS_OKAY(vnd_reserved_compat), "");

Comment on lines 2948 to 2961
* @brief Variant of DT_POWER_DOMAIN_CHILDREN_FOREACH_OKAY with VARGS
*
* @see DT_POWER_DOMAIN_CHILDREN_FOREACH_OKAY()
*
* @details The macro @p fn must take one parameter, which will be a node
* identifier, followed by variadic args. The macro is expanded once for
* every power domain child with status "okay".
*
* @param node_id Power domain's node identifier
* @param fn Macro to invoke
* @param ... Variadic args passed to @p fn
*/
#define DT_POWER_DOMAIN_CHILDREN_FOREACH_OKAY_VARGS(node_id, fn, ...) \
DT_CAT(node_id, _POWER_DOMAIN_CHILDREN_FOREACH_OKAY_VARGS)(fn, __VA_ARGS__)
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

other subsystems put the DT macros in include/zephyr/devicetree/ subfolder with their own header, can't this go that route as well?

Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

moved to its own header

@decsny
Copy link
Member

decsny commented May 2, 2025

And one more question , is are you targeting this PR to merge before 4.2 freeze point?

@bjarki-andreasen
Copy link
Collaborator Author

bjarki-andreasen commented May 2, 2025

@decsny I'm all for refactoring the soc_state_change power domain and the board using it as part of this PR, it is indeed "backwards" compared to the power domain behavior I envision with this PR. I just did not want to "rock the boat" too much at once :)

@bjarki-andreasen
Copy link
Collaborator Author

And one more question , is are you targeting this PR to merge before 4.2 freeze point?

I'm working with an "as soon as possible" timeline to refactor pm device in general, so probably yes :D

@decsny
Copy link
Member

decsny commented May 28, 2025

@JordanYates is your change request still relevant? This PR makes a lot of sense to me in it's current state, I would actually like to support it's merge sooner than later, so do we need an arbitration?

@bjarki-andreasen I'm not sure what are the twister failures here but is this PR finished otherwise?

@bjarki-andreasen
Copy link
Collaborator Author

@JordanYates is your change request still relevant? This PR makes a lot of sense to me in it's current state, I would actually like to support it's merge sooner than later, so do we need an arbitration?

@bjarki-andreasen I'm not sure what are the twister failures here but is this PR finished otherwise?

Not sure what the CI failures are about, those should be fast addressed :) Got sidelined by a whole lot of other tasks, will get back to this Monday, I think its a couple of days work until ready to merge :)

Copy link
Collaborator

@JordanYates JordanYates left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This PR is still doing a bunch of things that are not coupled to each other:

  • Fixing a GPIO driver
  • Updating the implementation of existing power domains
  • Adding a new type of power domain
  • Multiple commits related to one particular sensor sample
  • Implementing PM on a random sensor
  • Changing power domain setups on one SoC

Not all of these need to be their own PR, but the core of this PR (based on the title) is changing where power domain logic runs. That can be done without any of the extra commits that have crept in here.

I don't think its that hard to split this up to be more manageable (and faster for the trivial aspects)

static int invoke_suspend(const struct device *dev)
{
pm_policy_device_power_lock_put(dev);
pm_device_domain_children_turn_off(dev);
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

AFAICT this pattern is still adding a chunk of overhead to all PM calls.

Every suspend/resume cycle is iterating over every child node regardless of whether the device has any capability of acting as a power domain. This is particularly problematic for PM_DEVICE_FLAG_ISR_SAFE, which is running all of these operations in an ISR context.

Should this not be using PM_DEVICE_FLAG_PD in order to determine whether to run these functions in the first place?
This would also allow dropping the pm: device: pm_device_children_action_run: validate child domain commit entirely, as I can't see any way in which a device could be a child of a power domain without being on that power domain.

Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Only the actual invocation of either the RESUME or SUSPEND action has this overhead, not every call to pm_device_runtime_get() or put() for example, so though it is more overhead than optimal, IMO its ok for now.

I agree that we should add flags and test them instead (PM_DEVICE_FLAG_PD needs to be refactored to work properly, only checks if compat == "power-domain" ATM). I think its outside the scope of this PR, if you disagree I can surely refactor it :)

Regarding dropping pm: device: pm_device_children_action_run: validate child domain, we actually test for it here

power-domains = <&test_reg_1>;
alternate-domain = <&test_reg_chained>;
and
/* reg_chained should be powered after being requested */
pm_device_runtime_get(reg_chained);
zassert_true(pm_device_is_powered(reg_chained), "");
zassert_false(pm_device_is_powered(dev), "");
/* dev is not on reg_chained but does reference it, should still be in OFF */
pm_device_state_get(dev, &state);
zassert_equal(PM_DEVICE_STATE_OFF, state, "");
pm_device_runtime_put(reg_chained);
so I can't drop it without changing the test suite, I think that deserves its own PR as well

if (dev->pm_base->action_cb(dev, PM_DEVICE_ACTION_SUSPEND) < 0) {
pm_device_domain_children_turn_on(dev);
pm_policy_device_power_lock_get(dev);
return -EAGAIN;
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This also overrides return values from the drivers, which was not previously the case

Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

return value now returned

@bjarki-andreasen
Copy link
Collaborator Author

bjarki-andreasen commented May 29, 2025

@JordanYates some parts can be moved to their own PRs sure, but this PR is reworking power domains, which means devices relying on previous power domain logic need to be updated to not introduce regressions with this PR to then be fixed by a later PR, that's usually how refactors are done.

@JordanYates
Copy link
Collaborator

which means devices relying on previous power domain logic need to be updated

I'm not complaining about the commits that update power domain drivers, they are obviously fine.
Its the assortment of other changes, some of which are the cause of CI failures.
Other parts (e.g. the PM policy changes) are not obvious that they are even changed in this PR from the title or description, and people that care about PM policy may have simply skipped the PR as not relevant to them.

@bjarki-andreasen
Copy link
Collaborator Author

which means devices relying on previous power domain logic need to be updated

I'm not complaining about the commits that update power domain drivers, they are obviously fine. Its the assortment of other changes, some of which are the cause of CI failures. Other parts (e.g. the PM policy changes) are not obvious that they are even changed in this PR from the title or description, and people that care about PM policy may have simply skipped the PR as not relevant to them.

I get your point, I will create new PRs for those issues and cherry-pick them here :)

The power_domain_gpio driver was reimplementing the behavior of
pm_device_children_action_run(). Replace reimplementation with
pm_device_children_action_run().

Signed-off-by: Bjarki Arge Andreasen <[email protected]>
The gpio_smartbond driver incorrectly uses the static
PM_DEVICE_DEFINE() and PM_DEVICE_GET() macros when creating a driver
instance from a devicetree instance number.

Update to use PM_DEVICE_DT_INST_DEFINE() and
PM_DEVICE_DT_INST_GET() macros.

Signed-off-by: Bjarki Arge Andreasen <[email protected]>
Validate that device dep children indeed belong to domain when
iterating dependent devices from pm_device_children_action_run().

Signed-off-by: Bjarki Arge Andreasen <[email protected]>
Introduce helpers pm_device_domain_children_turn_on() and
pm_device_domain_children_turn_off() to be called from
PM_DEVICE_RUNTIME when after resuming and before suspending a
power domain.

Signed-off-by: Bjarki Arge Andreasen <[email protected]>
The APIs pm_policy_device_power_lock_get and
pm_policy_device_power_lock_put are not mocked properly as they
should be excluded based on CONFIG_PM_POLICY_DEVICE_CONSTRAINTS,
not CONFIG_PM.

Signed-off-by: Bjarki Arge Andreasen <[email protected]>
@bjarki-andreasen bjarki-andreasen force-pushed the pm-power-domain-refactor branch from 0fa6876 to f3b4976 Compare June 2, 2025 13:59
@bjarki-andreasen
Copy link
Collaborator Author

Move device power domain and pm policy device logic to
PM_DEVICE_RUNTIME. Specifically:

- When resuming a device, if successful, get pm policy power
  lock for device, and notify any device power domain children
  of the new state of the device by calling TURN_ON.
- When suspending a device, first notify power domain children
  that device will be suspended by calling TURN_OFF, then put
  pm policy device lock, then try to suspend the device. If
  failure, notify children that device will remain resumed,
  and get pm policy device lock again.

Signed-off-by: Bjarki Arge Andreasen <[email protected]>
PM_DEVICE_RUNTIME now implements power domain child actions. Remove
this duplicate logic from the device power domain driver.

Signed-off-by: Bjarki Arge Andreasen <[email protected]>
PM_DEVICE_RUNTIME now implements power domain child actions. Remove
this duplicate logic from the device power domain driver.

Signed-off-by: Bjarki Arge Andreasen <[email protected]>
Add default power domain driver which is useful for locking system
sleep states based on a group of devices.

Signed-off-by: Bjarki Arge Andreasen <[email protected]>
Use the default power domain driver instead of creating a bespoke
one within the test suite.

Signed-off-by: Bjarki Arge Andreasen <[email protected]>
Add support for pm device runtime to sample. Simply resume the
temperature device before using it. If PM_DEVICE_RUNTIME is not
enabled, device is assumed resumed on boot.

Signed-off-by: Bjarki Arge Andreasen <[email protected]>
Implement pm device runtime for p3t1755 sensor.

Signed-off-by: Bjarki Arge Andreasen <[email protected]>
@bjarki-andreasen bjarki-andreasen force-pushed the pm-power-domain-refactor branch from f3b4976 to f8e07b0 Compare June 2, 2025 14:16
Update board to use default power domain for the "peripheral"
domain, which when resumed prevents SoC from entering suspend and
standby states as these power down peripherals.

Signed-off-by: Bjarki Arge Andreasen <[email protected]>
Enable PM_DEVICE_RUNTIME and explicitlly enable dependencies
required for the power domains and device drivers power management
to function. For now, this requires manually enabling
PM_DEVICE_RUNTIME for every device used in the sample by adding the
devicetree prop zephyr,pm-device-runtime-auto to make the drivers
actually adhere to PM_DEVICE_RUNTIME...

Signed-off-by: Bjarki Arge Andreasen <[email protected]>
pm_device_runtime has inconsistent behavior regarding
getting and putting a device's domain. This commit
aligns it to get the domain only once, once device
is resumed, and put only once, once device is
suspended.

Signed-off-by: Bjarki Arge Andreasen <[email protected]>
@bjarki-andreasen bjarki-andreasen force-pushed the pm-power-domain-refactor branch from f8e07b0 to 4019fa7 Compare June 2, 2025 14:33
PM_DEVICE_SYSTEM_MANAGED nor PM_DEVICE_RUNTIME is not supported by
test suite, so disable any config which results in either being
enabled.

Signed-off-by: Bjarki Arge Andreasen <[email protected]>
PM_DEVICE_SYSTEM_MANAGED nor PM_DEVICE_RUNTIME is not supported by
test suite, so disable any config which results in either being
enabled.

Signed-off-by: Bjarki Arge Andreasen <[email protected]>
Copy link

sonarqubecloud bot commented Jun 2, 2025

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Projects
None yet
Development

Successfully merging this pull request may close these issues.

4 participants