
Commit d7fefcc

kvaneesh authored and torvalds committed
mm/cma: add PF flag to force non cma alloc
Patch series "mm/kvm/vfio/ppc64: Migrate compound pages out of CMA region", v8.

ppc64 uses the CMA area for the allocation of the guest page table (hash page table). We won't be able to start the guest if we fail to allocate the hash page table. We have observed hash table allocation failures because we failed to migrate pages out of the CMA region because they were pinned. This happens when we are using VFIO. VFIO on ppc64 pins the entire guest RAM. If the guest RAM pages get allocated out of the CMA region, we won't be able to migrate those pages. The pages are also pinned for the lifetime of the guest.

Currently we support migration of non-compound pages. With THP and with the addition of hugetlb migration we can end up allocating compound pages from the CMA region. This patch series adds support for migrating compound pages.

This patch (of 4):

Add PF_MEMALLOC_NOCMA, which makes sure any allocation in that context is marked non-movable and hence cannot be satisfied by the CMA region.

This is useful with get_user_pages_longterm where we want to take a page pin by migrating pages from the CMA region. Marking the section PF_MEMALLOC_NOCMA ensures that we avoid unnecessary page migration later.

Link: http://lkml.kernel.org/r/[email protected]
Signed-off-by: Aneesh Kumar K.V <[email protected]>
Suggested-by: Andrea Arcangeli <[email protected]>
Reviewed-by: Andrea Arcangeli <[email protected]>
Cc: Michal Hocko <[email protected]>
Cc: Alexey Kardashevskiy <[email protected]>
Cc: David Gibson <[email protected]>
Cc: Michael Ellerman <[email protected]>
Cc: Mel Gorman <[email protected]>
Cc: Vlastimil Babka <[email protected]>
Signed-off-by: Andrew Morton <[email protected]>
Signed-off-by: Linus Torvalds <[email protected]>
1 parent 6e2e07c commit d7fefcc

2 files changed: +41 -8 lines changed

include/linux/sched.h

Lines changed: 1 addition & 0 deletions
@@ -1407,6 +1407,7 @@ extern struct pid *cad_pid;
 #define PF_UMH			0x02000000	/* I'm an Usermodehelper process */
 #define PF_NO_SETAFFINITY	0x04000000	/* Userland is not allowed to meddle with cpus_allowed */
 #define PF_MCE_EARLY		0x08000000	/* Early kill for mce process policy */
+#define PF_MEMALLOC_NOCMA	0x10000000	/* All allocation request will have _GFP_MOVABLE cleared */
 #define PF_MUTEX_TESTER		0x20000000	/* Thread belongs to the rt mutex tester */
 #define PF_FREEZER_SKIP		0x40000000	/* Freezer should not count it as freezable */
 #define PF_SUSPEND_TASK		0x80000000	/* This thread called freeze_processes() and should not be frozen */

include/linux/sched/mm.h

Lines changed: 40 additions & 8 deletions
@@ -148,17 +148,25 @@ static inline bool in_vfork(struct task_struct *tsk)
  * Applies per-task gfp context to the given allocation flags.
  * PF_MEMALLOC_NOIO implies GFP_NOIO
  * PF_MEMALLOC_NOFS implies GFP_NOFS
+ * PF_MEMALLOC_NOCMA implies no allocation from CMA region.
  */
 static inline gfp_t current_gfp_context(gfp_t flags)
 {
-	/*
-	 * NOIO implies both NOIO and NOFS and it is a weaker context
-	 * so always make sure it makes precedence
-	 */
-	if (unlikely(current->flags & PF_MEMALLOC_NOIO))
-		flags &= ~(__GFP_IO | __GFP_FS);
-	else if (unlikely(current->flags & PF_MEMALLOC_NOFS))
-		flags &= ~__GFP_FS;
+	if (unlikely(current->flags &
+		     (PF_MEMALLOC_NOIO | PF_MEMALLOC_NOFS | PF_MEMALLOC_NOCMA))) {
+		/*
+		 * NOIO implies both NOIO and NOFS and it is a weaker context
+		 * so always make sure it makes precedence
+		 */
+		if (current->flags & PF_MEMALLOC_NOIO)
+			flags &= ~(__GFP_IO | __GFP_FS);
+		else if (current->flags & PF_MEMALLOC_NOFS)
+			flags &= ~__GFP_FS;
+#ifdef CONFIG_CMA
+		if (current->flags & PF_MEMALLOC_NOCMA)
+			flags &= ~__GFP_MOVABLE;
+#endif
+	}
 	return flags;
 }
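
The masking logic in this hunk can be exercised outside the kernel with a small user-space model. The sketch below is illustrative only: the `MODEL_*` names and `model_gfp_context()` are hypothetical, the task flags are passed as an argument instead of being read from `current->flags`, `CONFIG_CMA` is assumed to be enabled, and every numeric value except `0x10000000` (taken from the sched.h hunk) is a placeholder chosen for the model.

```c
#include <assert.h>

typedef unsigned int gfp_model_t;

/* Only the NOCMA value is taken from the sched.h hunk; the rest are placeholders. */
#define MODEL_PF_MEMALLOC_NOFS	0x00040000u
#define MODEL_PF_MEMALLOC_NOIO	0x00080000u
#define MODEL_PF_MEMALLOC_NOCMA	0x10000000u

/* Placeholder allocation-flag bits for the model. */
#define MODEL_GFP_MOVABLE	0x08u
#define MODEL_GFP_IO		0x40u
#define MODEL_GFP_FS		0x80u

/*
 * Mirrors the decision structure of the patched current_gfp_context():
 * one combined check for the common "no flag set" case, then NOIO taking
 * precedence over NOFS, then NOCMA applied independently of both.
 */
static gfp_model_t model_gfp_context(unsigned int task_flags, gfp_model_t gfp)
{
	if (task_flags & (MODEL_PF_MEMALLOC_NOIO | MODEL_PF_MEMALLOC_NOFS |
			  MODEL_PF_MEMALLOC_NOCMA)) {
		if (task_flags & MODEL_PF_MEMALLOC_NOIO)
			gfp &= ~(MODEL_GFP_IO | MODEL_GFP_FS);
		else if (task_flags & MODEL_PF_MEMALLOC_NOFS)
			gfp &= ~MODEL_GFP_FS;

		/* Not part of the else-if chain: NOCMA combines with NOIO or NOFS. */
		if (task_flags & MODEL_PF_MEMALLOC_NOCMA)
			gfp &= ~MODEL_GFP_MOVABLE;
	}
	return gfp;
}
```

In this model a task with only the NOCMA flag set loses the movable bit and keeps the IO and FS bits, while a task with both NOIO and NOCMA set loses all three.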

@@ -248,6 +256,30 @@ static inline void memalloc_noreclaim_restore(unsigned int flags)
 	current->flags = (current->flags & ~PF_MEMALLOC) | flags;
 }
 
+#ifdef CONFIG_CMA
+static inline unsigned int memalloc_nocma_save(void)
+{
+	unsigned int flags = current->flags & PF_MEMALLOC_NOCMA;
+
+	current->flags |= PF_MEMALLOC_NOCMA;
+	return flags;
+}
+
+static inline void memalloc_nocma_restore(unsigned int flags)
+{
+	current->flags = (current->flags & ~PF_MEMALLOC_NOCMA) | flags;
+}
+#else
+static inline unsigned int memalloc_nocma_save(void)
+{
+	return 0;
+}
+
+static inline void memalloc_nocma_restore(unsigned int flags)
+{
+}
+#endif
+
 #ifdef CONFIG_MEMCG
 /**
  * memalloc_use_memcg - Starts the remote memcg charging scope.
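
The save/restore pair added in this hunk returns the previous state of the flag so that scopes can nest: an inner restore must not clear a flag that an outer scope set. A minimal user-space sketch of that behavior follows, assuming a plain global (`scope_task_flags`, a hypothetical stand-in for `current->flags`) and hypothetical `scope_*` helper names; it models the `CONFIG_CMA` variant only and is not kernel code.

```c
#include <assert.h>

/* Value taken from the sched.h hunk; the SCOPE_ prefix marks it as a model constant. */
#define SCOPE_PF_MEMALLOC_NOCMA	0x10000000u

/* Hypothetical stand-in for current->flags. */
static unsigned int scope_task_flags;

/* Same shape as memalloc_nocma_save(): remember the old bit, then set it. */
static unsigned int scope_nocma_save(void)
{
	unsigned int old = scope_task_flags & SCOPE_PF_MEMALLOC_NOCMA;

	scope_task_flags |= SCOPE_PF_MEMALLOC_NOCMA;
	return old;
}

/* Same shape as memalloc_nocma_restore(): put back exactly the old bit. */
static void scope_nocma_restore(unsigned int old)
{
	scope_task_flags = (scope_task_flags & ~SCOPE_PF_MEMALLOC_NOCMA) | old;
}

/* Returns 1 if nested scopes leave the flag set until the outermost restore. */
static int scope_nesting_ok(void)
{
	unsigned int outer, inner;
	int ok = 1;

	scope_task_flags = 0;
	outer = scope_nocma_save();	/* flag was clear, so outer == 0 */
	inner = scope_nocma_save();	/* flag already set, so inner has the bit */

	scope_nocma_restore(inner);	/* inner scope ends: flag must stay set */
	ok &= (scope_task_flags & SCOPE_PF_MEMALLOC_NOCMA) != 0;

	scope_nocma_restore(outer);	/* outer scope ends: flag is cleared */
	ok &= (scope_task_flags & SCOPE_PF_MEMALLOC_NOCMA) == 0;
	return ok;
}
```

The commit message describes the intended use as marking a section, such as the page-pinning path, so that allocations made inside it avoid the CMA region; the model above only checks the flag bookkeeping, not any allocator behavior.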
