[RFC,01/10] bcma: Use array to store cores.

Message ID 1307311658-15853-2-git-send-email-hauke@hauke-m.de (mailing list archive)
State Not Applicable, archived

Commit Message

Hauke Mehrtens June 5, 2011, 10:07 p.m. UTC
When using bcma on an embedded device, it is initialized very early at
boot. We have to do so because the CPU, the interrupt management and all
other devices are attached to this bus. At that stage we cannot allocate
memory or sleep; we can only use memory on the stack and in the text
segment, as the kernel is not initialized far enough. This patch removes
the kzallocs from the scan code. Some earlier versions of the bcma
implementation and the normal ssb implementation do it like this.
The __bcma_dev_wrapper struct is used as the container for the device
struct, as bcma_device would be too big if it included struct device.

Signed-off-by: Hauke Mehrtens <hauke@hauke-m.de>
---
 drivers/bcma/main.c       |   86 ++++++++++++++++++++++++++++----------------
 drivers/bcma/scan.c       |   58 +++++++++++-------------------
 include/linux/bcma/bcma.h |   16 ++++++--
 3 files changed, 89 insertions(+), 71 deletions(-)

Comments

Arend van Spriel June 6, 2011, 8:31 a.m. UTC | #1
On 06/06/2011 12:07 AM, Hauke Mehrtens wrote:
> When using bcma on a embedded device it is initialized very early at
> boot. We have to do so as the cpu and interrupt management and all
> other devices are attached to this bus and it has to be initialized so
> early. In that stage we can not allocate memory or sleep, just use the
> memory on the stack and in the text segment as the kernel is not
> initialized far enough. This patch removed the kzallocs from the scan
> code. Some earlier version of the bcma implementation and the normal
> ssb implementation are doing it like this.
> The __bcma_dev_wrapper struct is used as the container for the device
> struct as bcma_device will be too big if it includes struct device.

Does this prevent list_for_each() and friends from being used on the 
device list? If so, could you consider a different approach? There were 
good reasons to get rid of the bcma_dev_wrapper struct, if I recall 
the discussions on the mailing list correctly. I also see a tendency to use 
ssb solutions without considering alternatives. For this particular 
example, please consider adding a bcma_zalloc(), which does kzalloc for 
non-embedded platforms and returns array pointers for embedded platforms. 
You could also consider this behavior for the embedded bus only.

Gr. AvS
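
A minimal userspace sketch of the bcma_zalloc() idea suggested above. It is
only an illustration: bcma_zalloc_core(), bcma_early_boot and early_pool are
hypothetical names, struct bcma_device is a trimmed-down stand-in, and calloc()
stands in for kzalloc().

```c
#include <stdbool.h>
#include <stddef.h>
#include <stdlib.h>
#include <string.h>

#define BCMA_MAX_NR_CORES 16

/* Trimmed-down stand-in for the kernel's struct bcma_device. */
struct bcma_device {
	unsigned short id;
	unsigned char core_index;
};

/* Static pool handed out while the regular allocator is unavailable. */
static struct bcma_device early_pool[BCMA_MAX_NR_CORES];
static size_t early_used;

/* Hypothetical flag; platform code would clear it once slab is up. */
static bool bcma_early_boot = true;

/* Hypothetical helper: returns zeroed storage for one core, or NULL. */
static struct bcma_device *bcma_zalloc_core(void)
{
	struct bcma_device *core;

	if (!bcma_early_boot)
		return calloc(1, sizeof(*core)); /* kzalloc() in the kernel */

	if (early_used >= BCMA_MAX_NR_CORES)
		return NULL; /* pool exhausted */

	core = &early_pool[early_used++];
	memset(core, 0, sizeof(*core));
	return core;
}
```

With a helper like this the scan code could keep its list-based storage. The
release path would then have to tell pool entries from heap entries before
freeing, which this sketch leaves out.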
Arend van Spriel June 6, 2011, 10:09 a.m. UTC | #2
On 06/06/2011 11:42 AM, Rafał Miłecki wrote:
> Greg, Arnd: could you take a look at this patch, please?
>
> With proposed patch we are going back to this ugly array and wrappers hacks.
>
> I was really happy with our final solution, but it seems it's not
> doable for embedded systems...? Is there something better we can do
> about this?

I do agree with Rafał that we should look for another alternative. I 
posted a suggestion earlier regarding this patch. Can anyone tell me 
whether that could avoid the need for the array/wrapper hack?

Gr. AvS
Arnd Bergmann June 6, 2011, 11:32 a.m. UTC | #3
On Monday 06 June 2011, Rafał Miłecki wrote:
> Greg, Arnd: could you take a look at this patch, please?
> 
> With proposed patch we are going back to this ugly array and wrappers hacks.
> 
> I was really happy with our final solution, but it seems it's not
> doable for embedded systems...? Is there something better we can do
> about this?
> 
> 2011/6/6 Hauke Mehrtens <hauke@hauke-m.de>:
> > When using bcma on a embedded device it is initialized very early at
> > boot. We have to do so as the cpu and interrupt management and all
> > other devices are attached to this bus and it has to be initialized so
> > early. In that stage we can not allocate memory or sleep, just use the
> > memory on the stack and in the text segment as the kernel is not
> > initialized far enough. This patch removed the kzallocs from the scan
> > code. Some earlier version of the bcma implementation and the normal
> > ssb implementation are doing it like this.
> > The __bcma_dev_wrapper struct is used as the container for the device
> > struct as bcma_device will be too big if it includes struct device.
> >
> > Signed-off-by: Hauke Mehrtens <hauke@hauke-m.de>

If you rely on device scan to find your CPUs and interrupt controllers,
you are screwed already, this won't work.

In that case, it's better to have a few "early" drivers, as few as
possible, that don't go through the bus scan at all but have their
own ways of bootstrapping themselves. I don't know what you mean by
"CPU management", but I can only assume that it's not doing that much,
and you can just put the register values into the device tree.

For an interrupt controller, it should be ok to have it initialized
late, as long as it's only responsible for the devices on the same
bus and not for instance for IPI interrupts. Just make sure that you
do the bus scan and the initialization of the IRQ driver before you
initialize any drivers that rely on the interrupts to be working.

	Arnd
--
To unsubscribe from this list: send the line "unsubscribe linux-wireless" in
the body of a message to majordomo@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html
George Kashperko June 6, 2011, 12:29 p.m. UTC | #4
Hi,

> On Monday 06 June 2011, Rafał Miłecki wrote:
> > Greg, Arnd: could you take a look at this patch, please?
> > 
> > With proposed patch we are going back to this ugly array and wrappers hacks.
> > 
> > I was really happy with our final solution, but it seems it's not
> > doable for embedded systems...? Is there something better we can do
> > about this?
> > 
> > 2011/6/6 Hauke Mehrtens <hauke@hauke-m.de>:
> > > When using bcma on a embedded device it is initialized very early at
> > > boot. We have to do so as the cpu and interrupt management and all
> > > other devices are attached to this bus and it has to be initialized so
> > > early. In that stage we can not allocate memory or sleep, just use the
> > > memory on the stack and in the text segment as the kernel is not
> > > initialized far enough. This patch removed the kzallocs from the scan
> > > code. Some earlier version of the bcma implementation and the normal
> > > ssb implementation are doing it like this.
> > > The __bcma_dev_wrapper struct is used as the container for the device
> > > struct as bcma_device will be too big if it includes struct device.
> > >
> > > Signed-off-by: Hauke Mehrtens <hauke@hauke-m.de>
> 
> If you rely on device scan to find your CPUs and interrupt controllers,
> you are screwed already, this won't work.
> 
> In that case, it's better to have a few "early" drivers, as few as
> possible, that don't go through the bus scan at all but have their
> own ways of bootstrapping themselves. I don't know what you mean by
> "CPU management", but I can only assume that it's not doing that much,
> and you can just put the register values into the device tree.
GPIOs, flash and UART could get initialized early without erom scanning
as Chipcommon seems always to be the #0 core on the amba interconnect.

> 
> For an interrupt controller, it should be ok to have it initialized
> late, as long as it's only responsible for the devices on the same
> bus and not for instance for IPI interrupts. Just make sure that you
> do the bus scan and the initialization of the IRQ driver before you
> initialize any drivers that rely in on the interrupts to be working.
Without proper timer init (which requires knowledge of both the chipcommon
and mips cores) the kernel will hang somewhere inside calibrate_delay.
This could be addressed by calling the bus scan from arch_init_irq or
plat_time_init; both are executed before calibrate_delay and with slab
available.

Have a nice day,
George


Arnd Bergmann June 6, 2011, 1:03 p.m. UTC | #5
On Monday 06 June 2011, George Kashperko wrote:
> > For an interrupt controller, it should be ok to have it initialized
> > late, as long as it's only responsible for the devices on the same
> > bus and not for instance for IPI interrupts. Just make sure that you
> > do the bus scan and the initialization of the IRQ driver before you
> > initialize any drivers that rely in on the interrupts to be working.
>
> Without proper timer init (which requires both the chipcommon and mips
> cores knowledge) kernel will get hung somewhere inside calibrate_delay.
> It could get addressed if get bus scan called in arch_init_irq or
> plat_time_init - both are executed before calibrate_delay and with slab
> available.

Ok, so you need the interrupt controller to be working for the timer tick,
right? I think another option (if that's not what you mean already) would
be to have a simpler way to find a device on the bus that can be called
before doing a full scan.

Early drivers would then have to know what is there and call a function
like "bcma_find_device(BCMA_DEV_ID_IRQ)", while drivers that are not
required to be up just register a regular device driver with a probe
function that gets called after the bus scan creates device structures.

	Arnd
Hauke Mehrtens June 6, 2011, 9:38 p.m. UTC | #6
On 06/06/2011 03:03 PM, Arnd Bergmann wrote:
> On Monday 06 June 2011, George Kashperko wrote:
>>> For an interrupt controller, it should be ok to have it initialized
>>> late, as long as it's only responsible for the devices on the same
>>> bus and not for instance for IPI interrupts. Just make sure that you
>>> do the bus scan and the initialization of the IRQ driver before you
>>> initialize any drivers that rely in on the interrupts to be working.
>>
>> Without proper timer init (which requires both the chipcommon and mips
>> cores knowledge) kernel will get hung somewhere inside calibrate_delay.
>> It could get addressed if get bus scan called in arch_init_irq or
>> plat_time_init - both are executed before calibrate_delay and with slab
>> available.
> 
> Ok, so you need the interrupt controller to be working for the timer tick,
> right? I think another option (if that's not what you mean already) would
> be to have a simpler way to find a device on the bus that can be called
> before doing a full scan.
> 
> Early drivers would then have to know what is there and call a function
> like "bcma_find_device(BCMA_DEV_ID_IRQ)", while drivers that are not
> required to be up just register a regular device driver with a probe
> function that gets called after the bus scan creates device structures.
> 
> 	Arnd
Accessing chip common should be possible without scanning the whole bus,
as it is at the first position, and initializing most things just needs
chip common. For initializing the interrupts, scanning is needed, as we do
not know where the mips core is located.

As we cannot use kzalloc in early boot, we could use a function which
uses kzalloc under normal conditions. In early boot, the architecture
code which starts the bcma code should instead provide a function which
returns a pointer to some memory in its text segment. We need space for
16 cores in the architecture code.

In addition bcma_bus_register(struct bcma_bus *bus) has to be divided
into two parts. The first part will scan the bus and initialize chip
common and mips core. The second part will initialize pci core and
register the devices in the system. When using this under normal
conditions they will be called directly after each other.

Hauke
Arnd Bergmann June 6, 2011, 9:53 p.m. UTC | #7
On Monday 06 June 2011 23:38:50 Hauke Mehrtens wrote:
> Accessing chip common should be possible without scanning the hole bus
> as it is at the first position and initializing most things just needs
> chip common. For initializing the interrupts scanning is needed as we do
> not know where the mips core is located.
> 
> As we can not use kalloc on early boot we could use a function which
> uses kalloc under normal conditions and when on early boot the
> architecture code which starts the bcma code should also provide a
> function which returns a pointer to some memory in its text segment to
> use. We need space for 16 cores in the architecture code.
>
> In addition bcma_bus_register(struct bcma_bus *bus) has to be divided
> into two parts. The first part will scan the bus and initialize chip
> common and mips core. The second part will initialize pci core and
> register the devices in the system. When using this under normal
> conditions they will be called directly after each other.

Just split out the minimal low-level function from the bcma_bus_scan
then, to locate a single device based on some identifier. The
bcma_bus_scan() function can then repeatedly allocate one device
and pass it to the low-level function when doing the proper scan,
while the arch code calls the low-level function directly with static
data.

	Arnd
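
A rough userspace sketch of the split described above, under loud assumptions:
the enumeration ROM is reduced to a flat table of struct erom_entry,
bcma_scan_one(), bcma_early_find_core() and bcma_full_scan() are hypothetical
names, struct bcma_device is a trimmed-down stand-in, and calloc() stands in
for kzalloc().

```c
#include <stddef.h>
#include <stdlib.h>

/* Trimmed-down stand-in; not the kernel's real definition. */
struct bcma_device {
	unsigned short manuf;
	unsigned short id;
	unsigned char core_index;
};

/* Hypothetical flattened view of the enumeration ROM: one entry per core. */
struct erom_entry {
	unsigned short manuf;
	unsigned short id;
};

/*
 * Low-level step: decode the entry at *pos into caller-provided storage
 * and advance *pos. Returns 0 on success, -1 at the end. Never allocates.
 */
static int bcma_scan_one(const struct erom_entry *erom, size_t n,
			 size_t *pos, struct bcma_device *core)
{
	if (*pos >= n)
		return -1;
	core->manuf = erom[*pos].manuf;
	core->id = erom[*pos].id;
	core->core_index = (unsigned char)*pos;
	(*pos)++;
	return 0;
}

/* Early path: arch code passes static storage and looks up one core id. */
static int bcma_early_find_core(const struct erom_entry *erom, size_t n,
				unsigned short id, struct bcma_device *out)
{
	struct bcma_device tmp;
	size_t pos = 0;

	while (bcma_scan_one(erom, n, &pos, &tmp) == 0) {
		if (tmp.id == id) {
			*out = tmp;
			return 0;
		}
	}
	return -1;
}

/* Full scan: allocate one device per iteration, then decode into it. */
static size_t bcma_full_scan(const struct erom_entry *erom, size_t n,
			     struct bcma_device **cores, size_t max)
{
	size_t pos = 0, count = 0;

	while (count < max) {
		struct bcma_device *core = calloc(1, sizeof(*core));

		if (!core)
			break;
		if (bcma_scan_one(erom, n, &pos, core) != 0) {
			free(core); /* no more entries */
			break;
		}
		cores[count++] = core;
	}
	return count;
}
```

Both paths share the decoding step, so only the full scan depends on a working
allocator.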
Arend van Spriel June 7, 2011, 10:12 a.m. UTC | #8
On 06/06/2011 11:53 PM, Arnd Bergmann wrote:
> On Monday 06 June 2011 23:38:50 Hauke Mehrtens wrote:
>> Accessing chip common should be possible without scanning the hole bus
>> as it is at the first position and initializing most things just needs
>> chip common. For initializing the interrupts scanning is needed as we do
>> not know where the mips core is located.
>>
>> As we can not use kalloc on early boot we could use a function which
>> uses kalloc under normal conditions and when on early boot the
>> architecture code which starts the bcma code should also provide a
>> function which returns a pointer to some memory in its text segment to
>> use. We need space for 16 cores in the architecture code.
>>
>> In addition bcma_bus_register(struct bcma_bus *bus) has to be divided
>> into two parts. The first part will scan the bus and initialize chip
>> common and mips core. The second part will initialize pci core and
>> register the devices in the system. When using this under normal
>> conditions they will be called directly after each other.
> Just split out the minimal low-level function from the bcma_bus_scan
> then, to locate a single device based on some identifier. The
> bcma_bus_scan() function can then repeatedly allocate one device
> and pass it to the low-level function when doing the proper scan,
> while the arch code calls the low-level function directly with static
> data.

If going for this we should pass struct bcma_device_id as match 
parameter as that identifies the core appropriately although you 
probably only want to match manufacturer and core identifiers.

Gr. AvS
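
A small sketch of such a match parameter, assuming wildcard values for the
fields that should be ignored. The struct layout follows the id fields used in
the patch below (manuf, id, rev, class); the MATCH_ANY_* constants and
bcma_id_matches() are hypothetical names for this illustration.

```c
#include <stdbool.h>
#include <stdint.h>

/* Stand-in with the same fields the patch reads from core->id. */
struct bcma_device_id {
	uint16_t manuf;
	uint16_t id;
	uint8_t rev;
	uint8_t class;
};

/* Hypothetical wildcard values meaning "do not compare this field". */
#define MATCH_ANY_MANUF 0xFFFF
#define MATCH_ANY_ID    0xFFFF
#define MATCH_ANY_REV   0xFF
#define MATCH_ANY_CLASS 0xFF

/* Returns true when every non-wildcard field of want equals found. */
static bool bcma_id_matches(const struct bcma_device_id *want,
			    const struct bcma_device_id *found)
{
	if (want->manuf != MATCH_ANY_MANUF && want->manuf != found->manuf)
		return false;
	if (want->id != MATCH_ANY_ID && want->id != found->id)
		return false;
	if (want->rev != MATCH_ANY_REV && want->rev != found->rev)
		return false;
	if (want->class != MATCH_ANY_CLASS && want->class != found->class)
		return false;
	return true;
}
```

An early lookup that only cares about manufacturer and core id would set rev
and class to the wildcard values.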
Hauke Mehrtens June 7, 2011, 9:44 p.m. UTC | #9
On 06/07/2011 12:12 PM, Arend van Spriel wrote:
> On 06/06/2011 11:53 PM, Arnd Bergmann wrote:
>> On Monday 06 June 2011 23:38:50 Hauke Mehrtens wrote:
>>> Accessing chip common should be possible without scanning the hole bus
>>> as it is at the first position and initializing most things just needs
>>> chip common. For initializing the interrupts scanning is needed as we do
>>> not know where the mips core is located.
>>>
>>> As we can not use kalloc on early boot we could use a function which
>>> uses kalloc under normal conditions and when on early boot the
>>> architecture code which starts the bcma code should also provide a
>>> function which returns a pointer to some memory in its text segment to
>>> use. We need space for 16 cores in the architecture code.
>>>
>>> In addition bcma_bus_register(struct bcma_bus *bus) has to be divided
>>> into two parts. The first part will scan the bus and initialize chip
>>> common and mips core. The second part will initialize pci core and
>>> register the devices in the system. When using this under normal
>>> conditions they will be called directly after each other.
>> Just split out the minimal low-level function from the bcma_bus_scan
>> then, to locate a single device based on some identifier. The
>> bcma_bus_scan() function can then repeatedly allocate one device
>> and pass it to the low-level function when doing the proper scan,
>> while the arch code calls the low-level function directly with static
>> data.
> 
> If going for this we should pass struct bcma_device_id as match
> parameter as that identifies the core appropriately although you
> probably only want to match manufacturer and core identifiers.
> 
> Gr. AvS
> 

What is the problem with scanning the full bus? Scanning in general
works for embedded devices; only allocating memory with kzalloc does not
work at that time. The architecture code (something in
arch/mips/bcm47xx/) could provide some memory to store the struct
bcma_core, like it does for struct bcma_bus. We could provide memory
just for the chipcommon and mips cores, or for all 16 possible cores,
the maximum number. As most embedded devices have ~9 cores, providing
memory for 16 cores is not a big waste of memory, and then we could use
the normal scan function.

A special scan function would just skip the wrong cores so I do not see
any advantage in that.

We could build a scan function which searches for one core, uses a
struct bcma_core stored on the stack, and returns the struct bcma_core if
it found the wanted one. Then we could search for chipcommon and mips,
store them in arch code in arch/mips/bcm47xx and use them. When boot
is ready and we are searching the complete bus, there are probably
some differences in the init process from normal init, as we already
initialized chipcommon sometime earlier. I would prefer to scan the bus
completely and initialize chipcommon and mips in early boot.

Hauke
Rafał Miłecki June 8, 2011, 12:06 a.m. UTC | #10
2011/6/7 Hauke Mehrtens <hauke@hauke-m.de>:
> On 06/07/2011 12:12 PM, Arend van Spriel wrote:
>> On 06/06/2011 11:53 PM, Arnd Bergmann wrote:
>>> On Monday 06 June 2011 23:38:50 Hauke Mehrtens wrote:
>>>> Accessing chip common should be possible without scanning the hole bus
>>>> as it is at the first position and initializing most things just needs
>>>> chip common. For initializing the interrupts scanning is needed as we do
>>>> not know where the mips core is located.
>>>>
>>>> As we can not use kalloc on early boot we could use a function which
>>>> uses kalloc under normal conditions and when on early boot the
>>>> architecture code which starts the bcma code should also provide a
>>>> function which returns a pointer to some memory in its text segment to
>>>> use. We need space for 16 cores in the architecture code.
>>>>
>>>> In addition bcma_bus_register(struct bcma_bus *bus) has to be divided
>>>> into two parts. The first part will scan the bus and initialize chip
>>>> common and mips core. The second part will initialize pci core and
>>>> register the devices in the system. When using this under normal
>>>> conditions they will be called directly after each other.
>>> Just split out the minimal low-level function from the bcma_bus_scan
>>> then, to locate a single device based on some identifier. The
>>> bcma_bus_scan() function can then repeatedly allocate one device
>>> and pass it to the low-level function when doing the proper scan,
>>> while the arch code calls the low-level function directly with static
>>> data.
>>
>> If going for this we should pass struct bcma_device_id as match
>> parameter as that identifies the core appropriately although you
>> probably only want to match manufacturer and core identifiers.
>>
>> Gr. AvS
>>
>
> What is the problem with scanning the full bus?

Because full scanning needs one of the following:
1) Working alloc - not possible for SoCs
2) Hacks with wrappers, static cores info, lack of optimization (list)


> A special scan function would just skip the wrong cores so I do not see
> any advantage in that.
>
> We could build a scan function which searches for one core and uses a
> struct bcma_core stored on the stack and returns the struct bcma_core if
> it found the wanted one.

Yeah, this should be quite easy.

struct bcma_device core = bcma_early_find_core(bus, CC);
bcma_cc_init(core);


> Then we could search for chipcommon and mips
> and store then in arch code in arch/mips/bcm47xx and use them.

Not sure about this one. You have drivers for chipcommon and mips as
part of bcma. Do you need to involve arch/mips/bcm47xx to this?


> When boot
> is ready and we are searching the complete bus there is probably
> something differences in the init process from normal init as we already
> initialized chipcommon sometime earlier.

Nothing hard to handle.


> I Would prefer to scan the bus
> completely and initialize chipcommon and mips in early boot.

Really, I've nothing against scanning and splitting init into "early"
and "late". It's going back to static fields and wrappers that I don't
like :(
Michael Büsch June 8, 2011, 8:20 a.m. UTC | #11
On Wed, 8 Jun 2011 02:06:11 +0200
Rafał Miłecki <zajec5@gmail.com> wrote:

> Because full scanning needs one of the following:
> 1) Working alloc - not possible for SoCs

Isn't there a bootmem allocator available on MIPS?
Hauke Mehrtens June 11, 2011, 10:33 p.m. UTC | #12
On 06/08/2011 10:20 AM, Michael Büsch wrote:
> On Wed, 8 Jun 2011 02:06:11 +0200
> Rafał Miłecki <zajec5@gmail.com> wrote:
> 
>> Because full scanning needs one of the following:
>> 1) Working alloc - not possible for SoCs
> 
> Isn't there a bootmem allocator available on MIPS?

The bootmem allocator is working on mips, but it is only initialized
after plat_mem_setup() has been called. To use it we would have to move
the start of bcma to some other place in the bcm47xx code.

We need access to the chipcommon and mips cores for different functions in
the bcm47xx code, and these functions are called by the mips
code, so we cannot store these struct bcma_devices on the stack.

I would use this struct on the embedded device and place it in the text
segment of the bcm47xx code.

In include/linux/bcma/bcma.h:
struct bcma_soc {
	struct bcma_bus bus;
	struct bcma_device core_cc;
	struct bcma_device core_mips;
};

In arch/mips/bcm47xx/setup.c
struct bcma_soc bus;

The chipcommon and mips cores can be initialized early without the use
of any alloc. The bcm47xx code will call
bcma_bus_early_register(struct bcma_soc *soc) and this code will find
these two cores, add them to the list of cores in bcma_bus and run the
init code for them. After that we have all we need to boot up the
device. After the kernel page allocator is fully set up we would search
for all the other cores, add them to the list of cores and do the
initialization for them. The two cores in struct bcma_soc will never be
accessed directly but only through struct bcma_bus, so that there is no
difference between early boot and normal mode.

Hauke
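
The two-phase registration described above could be sketched in userspace
roughly as follows. Everything beyond struct bcma_soc and the name
bcma_bus_early_register() is an assumption made for illustration: the extra
ids/n parameters replace the real bus scan, bcma_bus_late_register() and the
bus_*() helpers are hypothetical, a singly linked list stands in for the
kernel's list_head, calloc() stands in for kzalloc(), and the CORE_ID_* values
are placeholders. Each of the two early cores is assumed to appear once.

```c
#include <stddef.h>
#include <stdlib.h>

#define CORE_ID_CHIPCOMMON 0x800 /* placeholder value */
#define CORE_ID_MIPS       0x82C /* placeholder value */

struct bcma_device {
	unsigned short id;
	struct bcma_device *next; /* stand-in for struct list_head */
};

struct bcma_bus {
	struct bcma_device *cores;
	unsigned char nr_cores;
};

struct bcma_soc {
	struct bcma_bus bus;
	struct bcma_device core_cc;
	struct bcma_device core_mips;
};

static void bus_add_core(struct bcma_bus *bus, struct bcma_device *core)
{
	core->next = bus->cores;
	bus->cores = core;
	bus->nr_cores++;
}

static struct bcma_device *bus_find_core(struct bcma_bus *bus,
					  unsigned short id)
{
	struct bcma_device *core;

	for (core = bus->cores; core; core = core->next)
		if (core->id == id)
			return core;
	return NULL;
}

/* Phase 1: no allocation, only the storage embedded in struct bcma_soc. */
static void bcma_bus_early_register(struct bcma_soc *soc,
				    const unsigned short *ids, size_t n)
{
	size_t i;

	for (i = 0; i < n; i++) {
		if (ids[i] == CORE_ID_CHIPCOMMON) {
			soc->core_cc.id = ids[i];
			bus_add_core(&soc->bus, &soc->core_cc);
		} else if (ids[i] == CORE_ID_MIPS) {
			soc->core_mips.id = ids[i];
			bus_add_core(&soc->bus, &soc->core_mips);
		}
	}
}

/* Phase 2: allocator is available, add every remaining core. */
static int bcma_bus_late_register(struct bcma_soc *soc,
				  const unsigned short *ids, size_t n)
{
	size_t i;

	for (i = 0; i < n; i++) {
		struct bcma_device *core;

		if (ids[i] == CORE_ID_CHIPCOMMON || ids[i] == CORE_ID_MIPS)
			continue; /* already registered in phase 1 */

		core = calloc(1, sizeof(*core)); /* kzalloc() in the kernel */
		if (!core)
			return -1;
		core->id = ids[i];
		bus_add_core(&soc->bus, core);
	}
	return 0;
}
```

Callers only ever go through the bus list, so code using bus_find_core() does
not need to know whether a core came from the static struct or from the heap.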
Patch

diff --git a/drivers/bcma/main.c b/drivers/bcma/main.c
index a2f6b18..b0e7f5e 100644
--- a/drivers/bcma/main.c
+++ b/drivers/bcma/main.c
@@ -17,23 +17,27 @@  static int bcma_device_remove(struct device *dev);
 
 static ssize_t manuf_show(struct device *dev, struct device_attribute *attr, char *buf)
 {
-	struct bcma_device *core = container_of(dev, struct bcma_device, dev);
-	return sprintf(buf, "0x%03X\n", core->id.manuf);
+	struct __bcma_dev_wrapper *wrapper = container_of(dev,
+						struct __bcma_dev_wrapper, dev);
+	return sprintf(buf, "0x%03X\n", wrapper->core->id.manuf);
 }
 static ssize_t id_show(struct device *dev, struct device_attribute *attr, char *buf)
 {
-	struct bcma_device *core = container_of(dev, struct bcma_device, dev);
-	return sprintf(buf, "0x%03X\n", core->id.id);
+	struct __bcma_dev_wrapper *wrapper = container_of(dev,
+						struct __bcma_dev_wrapper, dev);
+	return sprintf(buf, "0x%03X\n", wrapper->core->id.id);
 }
 static ssize_t rev_show(struct device *dev, struct device_attribute *attr, char *buf)
 {
-	struct bcma_device *core = container_of(dev, struct bcma_device, dev);
-	return sprintf(buf, "0x%02X\n", core->id.rev);
+	struct __bcma_dev_wrapper *wrapper = container_of(dev,
+						struct __bcma_dev_wrapper, dev);
+	return sprintf(buf, "0x%02X\n", wrapper->core->id.rev);
 }
 static ssize_t class_show(struct device *dev, struct device_attribute *attr, char *buf)
 {
-	struct bcma_device *core = container_of(dev, struct bcma_device, dev);
-	return sprintf(buf, "0x%X\n", core->id.class);
+	struct __bcma_dev_wrapper *wrapper = container_of(dev,
+						struct __bcma_dev_wrapper, dev);
+	return sprintf(buf, "0x%X\n", wrapper->core->id.class);
 }
 static struct device_attribute bcma_device_attrs[] = {
 	__ATTR_RO(manuf),
@@ -53,27 +57,30 @@  static struct bus_type bcma_bus_type = {
 
 static struct bcma_device *bcma_find_core(struct bcma_bus *bus, u16 coreid)
 {
-	struct bcma_device *core;
-
-	list_for_each_entry(core, &bus->cores, list) {
-		if (core->id.id == coreid)
-			return core;
+	u8 i;
+	for (i = 0; i < bus->nr_cores; i++) {
+		if (bus->cores[i].id.id == coreid)
+			return &bus->cores[i];
 	}
 	return NULL;
 }
 
 static void bcma_release_core_dev(struct device *dev)
 {
-	struct bcma_device *core = container_of(dev, struct bcma_device, dev);
-	kfree(core);
+	struct __bcma_dev_wrapper *wrapper = container_of(dev,
+						struct __bcma_dev_wrapper, dev);
+	kfree(wrapper);
 }
 
 static int bcma_register_cores(struct bcma_bus *bus)
 {
 	struct bcma_device *core;
-	int err, dev_id = 0;
+	struct __bcma_dev_wrapper *wrapper;
+	int i, err, dev_id = 0;
+
+	for (i = 0; i < bus->nr_cores; i++) {
+		core = &(bus->cores[i]);
 
-	list_for_each_entry(core, &bus->cores, list) {
 		/* We support that cores ourself */
 		switch (core->id.id) {
 		case BCMA_CORE_CHIPCOMMON:
@@ -82,28 +89,37 @@  static int bcma_register_cores(struct bcma_bus *bus)
 			continue;
 		}
 
-		core->dev.release = bcma_release_core_dev;
-		core->dev.bus = &bcma_bus_type;
-		dev_set_name(&core->dev, "bcma%d:%d", 0/*bus->num*/, dev_id);
+		wrapper = kzalloc(sizeof(*wrapper), GFP_KERNEL);
+		if (!wrapper) {
+			pr_err("Could not allocate wrapper for core 0x%03X\n",
+			       core->id.id);
+			continue;
+		}
+
+		wrapper->core = core;
+		wrapper->dev.release = bcma_release_core_dev;
+		wrapper->dev.bus = &bcma_bus_type;
+		dev_set_name(&wrapper->dev, "bcma%d:%d", 0/*bus->num*/, dev_id);
 
 		switch (bus->hosttype) {
 		case BCMA_HOSTTYPE_PCI:
-			core->dev.parent = &bus->host_pci->dev;
-			core->dma_dev = &bus->host_pci->dev;
-			core->irq = bus->host_pci->irq;
+			wrapper->dev.parent = &bus->host_pci->dev;
+			wrapper->core->dma_dev = &bus->host_pci->dev;
+			wrapper->core->irq = bus->host_pci->irq;
 			break;
 		case BCMA_HOSTTYPE_NONE:
 		case BCMA_HOSTTYPE_SDIO:
 			break;
 		}
 
-		err = device_register(&core->dev);
+		err = device_register(&wrapper->dev);
 		if (err) {
 			pr_err("Could not register dev for core 0x%03X\n",
 			       core->id.id);
+			kfree(wrapper);
 			continue;
 		}
-		core->dev_registered = true;
+		core->dev = &wrapper->dev;
 		dev_id++;
 	}
 
@@ -113,10 +129,12 @@  static int bcma_register_cores(struct bcma_bus *bus)
 static void bcma_unregister_cores(struct bcma_bus *bus)
 {
 	struct bcma_device *core;
+	int i;
 
-	list_for_each_entry(core, &bus->cores, list) {
-		if (core->dev_registered)
-			device_unregister(&core->dev);
+	for (i = 0; i < bus->nr_cores; i++) {
+		core = &(bus->cores[i]);
+		if (core->dev)
+			device_unregister(core->dev);
 	}
 }
 
@@ -179,7 +197,9 @@  EXPORT_SYMBOL_GPL(bcma_driver_unregister);
 
 static int bcma_bus_match(struct device *dev, struct device_driver *drv)
 {
-	struct bcma_device *core = container_of(dev, struct bcma_device, dev);
+	struct __bcma_dev_wrapper *wrapper = container_of(dev,
+						struct __bcma_dev_wrapper, dev);
+	struct bcma_device *core = wrapper->core;
 	struct bcma_driver *adrv = container_of(drv, struct bcma_driver, drv);
 	const struct bcma_device_id *cid = &core->id;
 	const struct bcma_device_id *did;
@@ -196,7 +216,9 @@  static int bcma_bus_match(struct device *dev, struct device_driver *drv)
 
 static int bcma_device_probe(struct device *dev)
 {
-	struct bcma_device *core = container_of(dev, struct bcma_device, dev);
+	struct __bcma_dev_wrapper *wrapper = container_of(dev,
+						struct __bcma_dev_wrapper, dev);
+	struct bcma_device *core = wrapper->core;
 	struct bcma_driver *adrv = container_of(dev->driver, struct bcma_driver,
 					       drv);
 	int err = 0;
@@ -209,7 +231,9 @@  static int bcma_device_probe(struct device *dev)
 
 static int bcma_device_remove(struct device *dev)
 {
-	struct bcma_device *core = container_of(dev, struct bcma_device, dev);
+	struct __bcma_dev_wrapper *wrapper = container_of(dev,
+						struct __bcma_dev_wrapper, dev);
+	struct bcma_device *core = wrapper->core;
 	struct bcma_driver *adrv = container_of(dev->driver, struct bcma_driver,
 					       drv);
 
diff --git a/drivers/bcma/scan.c b/drivers/bcma/scan.c
index 40d7dcc..70b39f7 100644
--- a/drivers/bcma/scan.c
+++ b/drivers/bcma/scan.c
@@ -211,9 +211,6 @@  int bcma_bus_scan(struct bcma_bus *bus)
 	s32 tmp;
 	u8 i, j;
 
-	int err;
-
-	INIT_LIST_HEAD(&bus->cores);
 	bus->nr_cores = 0;
 
 	bcma_scan_switch_core(bus, BCMA_ADDR_BASE);
@@ -230,11 +227,8 @@  int bcma_bus_scan(struct bcma_bus *bus)
 	bcma_scan_switch_core(bus, erombase);
 
 	while (eromptr < eromend) {
-		struct bcma_device *core = kzalloc(sizeof(*core), GFP_KERNEL);
-		if (!core)
-			return -ENOMEM;
-		INIT_LIST_HEAD(&core->list);
-		core->bus = bus;
+		struct bcma_device core = { };
+		core.bus = bus;
 
 		/* get CIs */
 		cia = bcma_erom_get_ci(bus, &eromptr);
@@ -242,27 +236,24 @@  int bcma_bus_scan(struct bcma_bus *bus)
 			bcma_erom_push_ent(&eromptr);
 			if (bcma_erom_is_end(bus, &eromptr))
 				break;
-			err= -EILSEQ;
-			goto out;
+			return -EILSEQ;
 		}
 		cib = bcma_erom_get_ci(bus, &eromptr);
-		if (cib < 0) {
-			err= -EILSEQ;
-			goto out;
-		}
+		if (cib < 0)
+			return -EILSEQ;
 
 		/* parse CIs */
-		core->id.class = (cia & SCAN_CIA_CLASS) >> SCAN_CIA_CLASS_SHIFT;
-		core->id.id = (cia & SCAN_CIA_ID) >> SCAN_CIA_ID_SHIFT;
-		core->id.manuf = (cia & SCAN_CIA_MANUF) >> SCAN_CIA_MANUF_SHIFT;
+		core.id.class = (cia & SCAN_CIA_CLASS) >> SCAN_CIA_CLASS_SHIFT;
+		core.id.id = (cia & SCAN_CIA_ID) >> SCAN_CIA_ID_SHIFT;
+		core.id.manuf = (cia & SCAN_CIA_MANUF) >> SCAN_CIA_MANUF_SHIFT;
 		ports[0] = (cib & SCAN_CIB_NMP) >> SCAN_CIB_NMP_SHIFT;
 		ports[1] = (cib & SCAN_CIB_NSP) >> SCAN_CIB_NSP_SHIFT;
 		wrappers[0] = (cib & SCAN_CIB_NMW) >> SCAN_CIB_NMW_SHIFT;
 		wrappers[1] = (cib & SCAN_CIB_NSW) >> SCAN_CIB_NSW_SHIFT;
-		core->id.rev = (cib & SCAN_CIB_REV) >> SCAN_CIB_REV_SHIFT;
+		core.id.rev = (cib & SCAN_CIB_REV) >> SCAN_CIB_REV_SHIFT;
 
-		if (((core->id.manuf == BCMA_MANUF_ARM) &&
-		     (core->id.id == 0xFFF)) ||
+		if (((core.id.manuf == BCMA_MANUF_ARM) &&
+		     (core.id.id == 0xFFF)) ||
 		    (ports[1] == 0)) {
 			bcma_erom_skip_component(bus, &eromptr);
 			continue;
@@ -285,10 +276,8 @@  int bcma_bus_scan(struct bcma_bus *bus)
 		/* get & parse master ports */
 		for (i = 0; i < ports[0]; i++) {
 			u32 mst_port_d = bcma_erom_get_mst_port(bus, &eromptr);
-			if (mst_port_d < 0) {
-				err= -EILSEQ;
-				goto out;
-			}
+			if (mst_port_d < 0)
+				return -EILSEQ;
 		}
 
 		/* get & parse slave ports */
@@ -303,7 +292,7 @@  int bcma_bus_scan(struct bcma_bus *bus)
 					break;
 				} else {
 					if (i == 0 && j == 0)
-						core->addr = tmp;
+						core.addr = tmp;
 				}
 			}
 		}
@@ -320,7 +309,7 @@  int bcma_bus_scan(struct bcma_bus *bus)
 					break;
 				} else {
 					if (i == 0 && j == 0)
-						core->wrap = tmp;
+						core.wrap = tmp;
 				}
 			}
 		}
@@ -338,22 +327,19 @@  int bcma_bus_scan(struct bcma_bus *bus)
 					break;
 				} else {
 					if (wrappers[0] == 0 && !i && !j)
-						core->wrap = tmp;
+						core.wrap = tmp;
 				}
 			}
 		}
 
 		pr_info("Core %d found: %s "
 			"(manuf 0x%03X, id 0x%03X, rev 0x%02X, class 0x%X)\n",
-			bus->nr_cores, bcma_device_name(&core->id),
-			core->id.manuf, core->id.id, core->id.rev,
-			core->id.class);
-
-		core->core_index = bus->nr_cores++;
-		list_add(&core->list, &bus->cores);
-		continue;
-out:
-		return err;
+			bus->nr_cores, bcma_device_name(&core.id),
+			core.id.manuf, core.id.id, core.id.rev,
+			core.id.class);
+
+		core.core_index = bus->nr_cores;
+		bus->cores[bus->nr_cores++] = core;
 	}
 
 	return 0;
diff --git a/include/linux/bcma/bcma.h b/include/linux/bcma/bcma.h
index 27a27a7..3dc5302 100644
--- a/include/linux/bcma/bcma.h
+++ b/include/linux/bcma/bcma.h
@@ -118,14 +118,23 @@  struct bcma_host_ops {
 
 #define BCMA_MAX_NR_CORES		16
 
+/* 1) It is not allowed to put struct device statically in bcma_device
+ * 2) We can not just use pointer to struct device because we use container_of
+ * 3) We do not have pointer to struct bcma_device in struct device
+ * Solution: use such a dummy wrapper
+ */
+struct __bcma_dev_wrapper {
+	struct device dev;
+	struct bcma_device *core;
+};
+
 struct bcma_device {
 	struct bcma_bus *bus;
 	struct bcma_device_id id;
 
-	struct device dev;
+	struct device *dev;
 	struct device *dma_dev;
 	unsigned int irq;
-	bool dev_registered;
 
 	u8 core_index;
 
@@ -133,7 +142,6 @@  struct bcma_device {
 	u32 wrap;
 
 	void *drvdata;
-	struct list_head list;
 };
 
 static inline void *bcma_get_drvdata(struct bcma_device *core)
@@ -182,7 +190,7 @@  struct bcma_bus {
 	struct bcma_chipinfo chipinfo;
 
 	struct bcma_device *mapped_core;
-	struct list_head cores;
+	struct bcma_device cores[BCMA_MAX_NR_CORES];
 	u8 nr_cores;
 
 	struct bcma_drv_cc drv_cc;